Re: Strange Windows problem, lock_timeout test request

Lists: pgsql-hackers
From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Strange Windows problem, lock_timeout test request
Date: 2013-01-18 22:43:15
Message-ID: 50F9D003.6000805@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

using the MinGW cross-compiled PostgreSQL binaries from Fedora 18,
I get the following error for both 32 and 64-bit compiled executables.
listen_addresses = '*' and "trust" authentication was set for both.
The firewall was disabled for the tests and the server logs "incomplete startup packet".
The 64-bit version was compiled with my lock_timeout patch, the 32-bit
was without.

64-bit, connect to localhost:

C:\Users\Ákos\Desktop\PG93>bin\psql
psql: could not connect to server: Operation would block (0x00002733/10035)
Is the server running on host "localhost" (::1) and accepting
TCP/IP connections on port 5432?
could not connect to server: Operation would block (0x00002733/10035)
Is the server running on host "localhost" (127.0.0.1) and accepting
TCP/IP connections on port 5432?

64-bit, connect to own IP:

C:\Users\Ákos\Desktop\PG93>bin\psql -h 192.168.1.4
psql: could not connect to server: Operation would block (0x00002733/10035)
Is the server running on host "192.168.1.4" and accepting
TCP/IP connections on port 5432?

32-bit, connect to own IP:

C:\Users\Ákos\Desktop\PG93-32>bin\psql -h 192.168.1.4
psql: could not connect to server: Operation would block (0x00002733/10035)
Is the server running on host "192.168.1.4" and accepting
TCP/IP connections on port 5432?

Does it ring a bell for someone?

Unfortunately, I won't have time to do anything with my lock_timeout patch
for about 3 weeks. Does anyone have a little spare time to test it on Windows?
The patch is here, it still applies to HEAD without rejects or fuzz:
http://www.postgresql.org/message-id/506C0854.7090008@cybertec.at

Thanks in advance,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 00:13:07
Message-ID: 50F9E513.1060407@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/18/2013 05:43 PM, Boszormenyi Zoltan wrote:
> Hi,
>
> using the MinGW cross-compiled PostgreSQL binaries from Fedora 18,
> I get the following error for both 32 and 64-bit compiled executables.
> listen_addresses = '*' and "trust" authentication was set for both.
> The firewall was disabled for the tests and the server logs
> "incomplete startup packet".
> The 64-bit version was compiled with my lock_timeout patch, the 32-bit
> was without.
>
> 64-bit, connect to localhost:
>
> C:\Users\Ákos\Desktop\PG93>bin\psql
> psql: could not connect to server: Operation would block
> (0x00002733/10035)
> Is the server running on host "localhost" (::1) and accepting
> TCP/IP connections on port 5432?
> could not connect to server: Operation would block (0x00002733/10035)
> Is the server running on host "localhost" (127.0.0.1) and
> accepting
> TCP/IP connections on port 5432?
>
> 64-bit, connect to own IP:
>
> C:\Users\Ákos\Desktop\PG93>bin\psql -h 192.168.1.4
> psql: could not connect to server: Operation would block
> (0x00002733/10035)
> Is the server running on host "192.168.1.4" and accepting
> TCP/IP connections on port 5432?
>
> 32-bit, connect to own IP:
>
> C:\Users\Ákos\Desktop\PG93-32>bin\psql -h 192.168.1.4
> psql: could not connect to server: Operation would block
> (0x00002733/10035)
> Is the server running on host "192.168.1.4" and accepting
> TCP/IP connections on port 5432?
>
> Does it ring a bell for someone?
>
> Unfortunately, I won't have time to do anything with my lock_timeout
> patch
> for about 3 weeks. Does anyone have a little spare time to test it on
> Windows?
> The patch is here, it still applies to HEAD without rejects or fuzz:
> http://www.postgresql.org/message-id/506C0854.7090008@cybertec.at

Yes it rings a bell. See
<http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>

Cross-compiling is not really a supported platform. Why don't you just
build natively? This is know to work as shown by the buildfarm animals
doing it successfully.

cheers

andrew


From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 05:52:41
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C383BEB8CB5@szxeml509-mbx
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
> Hi,

> Unfortunately, I won't have time to do anything with my lock_timeout patch
> for about 3 weeks. Does anyone have a little spare time to test it on Windows?

I shall try to do it, probably next week.
Others are also welcome to test the patch.

With Regards,
Amit Kapila.


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 07:36:04
Message-ID: 50FA4CE4.9000301@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-19 01:13 keltezéssel, Andrew Dunstan írta:
>
> On 01/18/2013 05:43 PM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> using the MinGW cross-compiled PostgreSQL binaries from Fedora 18,
>> I get the following error for both 32 and 64-bit compiled executables.
>> listen_addresses = '*' and "trust" authentication was set for both.
>> The firewall was disabled for the tests and the server logs "incomplete startup packet".
>> The 64-bit version was compiled with my lock_timeout patch, the 32-bit
>> was without.
>>
>> 64-bit, connect to localhost:
>>
>> C:\Users\Ákos\Desktop\PG93>bin\psql
>> psql: could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "localhost" (::1) and accepting
>> TCP/IP connections on port 5432?
>> could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "localhost" (127.0.0.1) and accepting
>> TCP/IP connections on port 5432?
>>
>> 64-bit, connect to own IP:
>>
>> C:\Users\Ákos\Desktop\PG93>bin\psql -h 192.168.1.4
>> psql: could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "192.168.1.4" and accepting
>> TCP/IP connections on port 5432?
>>
>> 32-bit, connect to own IP:
>>
>> C:\Users\Ákos\Desktop\PG93-32>bin\psql -h 192.168.1.4
>> psql: could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "192.168.1.4" and accepting
>> TCP/IP connections on port 5432?
>>
>> Does it ring a bell for someone?
>>
>> Unfortunately, I won't have time to do anything with my lock_timeout patch
>> for about 3 weeks. Does anyone have a little spare time to test it on Windows?
>> The patch is here, it still applies to HEAD without rejects or fuzz:
>> http://www.postgresql.org/message-id/506C0854.7090008@cybertec.at
>
> Yes it rings a bell. See
> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>

The strange thing is that I have used cross-compiled build
during Fedora 16 prime time (using the mingw-w64 test repository) and
early Fedora 17 using the mingw32-* and ming64-* GCC packages
that were pulled from the above mentioned repository. It was the last time
I tested PG on Windows and it worked. I could connect to it both locally
and from my Linux PC.

>
> Cross-compiling is not really a supported platform. Why don't you just build natively?
> This is know to work as shown by the buildfarm animals doing it successfully.

Because I don't have a mingw setup on Windows. (Sorry.)

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 07:36:48
Message-ID: 50FA4D10.6080602@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-19 06:52 keltezéssel, Amit kapila írta:
> On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
>> Hi,
>
>
>> Unfortunately, I won't have time to do anything with my lock_timeout patch
>> for about 3 weeks. Does anyone have a little spare time to test it on Windows?
> I shall try to do it, probably next week.
> Others are also welcome to test the patch.

Thanks very much in advance.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 07:51:00
Message-ID: 50FA5064.6040807@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-19 01:13 keltezéssel, Andrew Dunstan írta:
>
> On 01/18/2013 05:43 PM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> using the MinGW cross-compiled PostgreSQL binaries from Fedora 18,
>> I get the following error for both 32 and 64-bit compiled executables.
>> listen_addresses = '*' and "trust" authentication was set for both.
>> The firewall was disabled for the tests and the server logs "incomplete startup packet".
>> The 64-bit version was compiled with my lock_timeout patch, the 32-bit
>> was without.
>>
>> 64-bit, connect to localhost:
>>
>> C:\Users\Ákos\Desktop\PG93>bin\psql
>> psql: could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "localhost" (::1) and accepting
>> TCP/IP connections on port 5432?
>> could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "localhost" (127.0.0.1) and accepting
>> TCP/IP connections on port 5432?
>>
>> 64-bit, connect to own IP:
>>
>> C:\Users\Ákos\Desktop\PG93>bin\psql -h 192.168.1.4
>> psql: could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "192.168.1.4" and accepting
>> TCP/IP connections on port 5432?
>>
>> 32-bit, connect to own IP:
>>
>> C:\Users\Ákos\Desktop\PG93-32>bin\psql -h 192.168.1.4
>> psql: could not connect to server: Operation would block (0x00002733/10035)
>> Is the server running on host "192.168.1.4" and accepting
>> TCP/IP connections on port 5432?
>>
>> Does it ring a bell for someone?
>>
>> Unfortunately, I won't have time to do anything with my lock_timeout patch
>> for about 3 weeks. Does anyone have a little spare time to test it on Windows?
>> The patch is here, it still applies to HEAD without rejects or fuzz:
>> http://www.postgresql.org/message-id/506C0854.7090008@cybertec.at
>
> Yes it rings a bell. See
> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>

I wanted to add a comment to this blog entry but it wasn't accepted.
Here it is:

It doesn't work under Wine, see:
http://www.winehq.org/pipermail/wine-users/2013-January/107008.html

But pg_config works so other PostgreSQL clients can also be built using the cross compiler.

>
> Cross-compiling is not really a supported platform. Why don't you just build natively?
> This is know to work as shown by the buildfarm animals doing it successfully.
>
> cheers
>
> andrew
>
>
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 20:15:31
Message-ID: 50FAFEE3.5040006@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote:
>
>>
>> Cross-compiling is not really a supported platform. Why don't you
>> just build natively? This is know to work as shown by the buildfarm
>> animals doing it successfully.
>
> Because I don't have a mingw setup on Windows. (Sorry.)
>

A long time ago I had a lot of sympathy with this answer, but these days
not so much. Getting a working mingw/msys environment sufficient to
build a bare bones PostgreSQL from scratch is both cheap and fairly
easy. The improvements that mingw has made in its install process, and
the presence of cheap or free windows instances in the cloud combine to
make this pretty simple. But since it's still slightly involved here is
how I constructed one such this morning:

* Create an Amazon t1.micro spot instance of
Windows_Server-2008-SP2-English-64Bit-Base-2012.12.12 (ami-554ac83c)
(current price $0.006 / hour)
* get the credentials and log in using:
o rdesktop -g 80% -u Administrator -p 'password' amazon-hostname
* turn off annoying IE security enhancements, and fire up IE
* go to
<http://sourceforge.net/projects/mingw/files/Installer/mingw-get-inst/>
and download latest mingw-get-inst
* run this - make sure to select the Msys and the developer toolkit in
addition to the C compiler
* navigate in explorer or a command window to \Mingw\Msys\1.0 and run
msys.bat
* run this command to install some useful packages:
o mingw-get install msys-wget msys-rxvt msys-unzip
* close that window
* open a normal command window and run the following to create an
unprivileged user and open an msys window as that user:
o net user pgrunner SomePw1234 /add
o runas /user:pgrunner "cmd /c \mingw\msys\1.0\msys.bat --rxvt"
* in the rxvt window run:
o wget
http://ftp.postgresql.org/pub/snapshot/dev/postgresql-snapshot.tar.gz
o tar -z -xf postgresql-snapshot.tar.gz
o wget
"http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Automated%20Builds/mingw-w64-bin_i686-mingw_20111220.zip/download"
o mkdir /mingw64
o cd /mingw64
o unzip ~/mingw-w64-bin_i686-mingw_20111220.zip
o cd ~/postgresql-9.3devel
o export PATH=/mingw64/bin:$PATH
o ./configure --host=x86_64-w64-mingw32 --without-zlib && make &&
make check
+ ( make some coffee and do the crossword or read War and
Peace - this can take a while)

Of course, for something faster you would pick a rather more expensive
instance than t1.micro (say, m1.large, which usually has a spot price of
$0.03 to $0.06 per hour), and if you're going to do this a lot you'll
stash your stuff on an EBS volume that you can remount as needed, or
zip it up and put it on S3, and then just pull it and unpack it in one
hit from there. And there's barely enough room left on the root file
system to do what's above anyway. But you can get the idea from this.
Note that we no longer need to look elsewhere for extra tools like flex
and bison as we once did - the ones that come with modern Msys should
work just fine.

If you want more build features (openssl, libxml, libxslt, perl, python
etc) then there is extra work to do in getting hold of those. But then
cross-compiling for those things might not be easy either.

Somebody more adept at automating Amazon than I am should be able to
automate most or all of this.

Anyway that should be just about enough info for just about any
competent developer to get going, even if they have never touched
Windows. (No doubt one or two people will want to quibble with what I've
done. That's fine - this is a description, not a prescription.)

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-19 23:15:39
Message-ID: 50FB291B.8090104@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/19/2013 02:51 AM, Boszormenyi Zoltan wrote:
>> Yes it rings a bell. See
>> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>
>
>
> I wanted to add a comment to this blog entry but it wasn't accepted.

The blog is closed for comments. I have moved to a new blog, and this is
just there for archive purposes.

> Here it is:
>
> It doesn't work under Wine, see:
> http://www.winehq.org/pipermail/wine-users/2013-January/107008.html
>
> But pg_config works so other PostgreSQL clients can also be built
> using the cross compiler.
>

If you want to target Wine I think you're totally on your own.

cheers

andrew


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-23 13:28:25
Message-ID: 50FFE579.3000709@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-20 00:15 keltezéssel, Andrew Dunstan írta:
>
> On 01/19/2013 02:51 AM, Boszormenyi Zoltan wrote:
>>> Yes it rings a bell. See
>>> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>
>>
>>
>> I wanted to add a comment to this blog entry but it wasn't accepted.
>
>
> The blog is closed for comments. I have moved to a new blog, and this
> is just there for archive purposes.
>
>
>> Here it is:
>>
>> It doesn't work under Wine, see:
>> http://www.winehq.org/pipermail/wine-users/2013-January/107008.html
>>
>> But pg_config works so other PostgreSQL clients can also be built
>> using the cross compiler.
>>
>
>
> If you want to target Wine I think you're totally on your own.

Yes, I know, it was only an attempt. The user's
administrator privilege under Wine is a showstopper.

>
> cheers
>
> andrew
>
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndQuadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-23 13:31:06
Message-ID: 50FFE61A.7050106@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-19 21:15 keltezéssel, Andrew Dunstan írta:
>
> On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote:
>>
>>>
>>> Cross-compiling is not really a supported platform. Why don't you
>>> just build natively? This is know to work as shown by the buildfarm
>>> animals doing it successfully.
>>
>> Because I don't have a mingw setup on Windows. (Sorry.)
>>
>
> A long time ago I had a lot of sympathy with this answer, but these
> days not so much.

I didn't ask for sympathy. :-) I am just on a totally different
project until 9th February and I am also far away from my desk.
So I can't even attempt to install Windows+mingw inside Qemu/KVM.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-24 18:44:09
Message-ID: CAMkU=1wQEvU1DFZY5aMTTkjaHqcJArAZX8nES5zf977oOXLs5g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Jan 19, 2013 at 12:15 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote:
>>
>
> A long time ago I had a lot of sympathy with this answer, but these days not
> so much. Getting a working mingw/msys environment sufficient to build a bare
> bones PostgreSQL from scratch is both cheap and fairly easy. The
> improvements that mingw has made in its install process, and the presence of
> cheap or free windows instances in the cloud combine to make this pretty
> simple. But since it's still slightly involved here is how I constructed
> one such this morning:

I've used this description, skipping the Amazon part and putting it
directly on my Windows computer, and it worked.

Except bin/pg_ctl does not work. It just silently exits without doing
anything, so I have to use bin/postgres to start the database (which
is what "make check" uses anyway, so not a problem if you just want
make check). Is that just me or is that a known problem? I've seen
some discussion from 2004, but didn't find a conclusion.

What advantages does mingw have over MSVC? Is it just that it is
cost-free, or is it easier to use mingw than MSVC for someone used to
building on Linux? (mingw certainly does not seem to have the
advantage of being fast!)

Would you like to put this somewhere on wiki.postgresql.org, or would
you mind if I do so?

Thanks for the primer,

Jeff


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-24 19:41:18
Message-ID: 51018E5E.8020005@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/24/2013 01:44 PM, Jeff Janes wrote:
> On Sat, Jan 19, 2013 at 12:15 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote:
>> A long time ago I had a lot of sympathy with this answer, but these days not
>> so much. Getting a working mingw/msys environment sufficient to build a bare
>> bones PostgreSQL from scratch is both cheap and fairly easy. The
>> improvements that mingw has made in its install process, and the presence of
>> cheap or free windows instances in the cloud combine to make this pretty
>> simple. But since it's still slightly involved here is how I constructed
>> one such this morning:
> I've used this description, skipping the Amazon part and putting it
> directly on my Windows computer, and it worked.
>
> Except bin/pg_ctl does not work. It just silently exits without doing
> anything, so I have to use bin/postgres to start the database (which
> is what "make check" uses anyway, so not a problem if you just want
> make check). Is that just me or is that a known problem? I've seen
> some discussion from 2004, but didn't find a conclusion.

Did you copy libpq.dll from the lib directory to the bin directory? If
not, try that and see if it fixes the problem.

>
> What advantages does mingw have over MSVC? Is it just that it is
> cost-free, or is it easier to use mingw than MSVC for someone used to
> building on Linux? (mingw certainly does not seem to have the
> advantage of being fast!)

See Craig's email today about problems with SDKs and availabilty of

>
> Would you like to put this somewhere on wiki.postgresql.org, or would
> you mind if I do so?
>
> Thanks for the primer,
>
> Jeff
>


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-24 20:08:14
Message-ID: 510194AE.8040308@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/24/2013 02:41 PM, Andrew Dunstan wrote:
>
>>
>> What advantages does mingw have over MSVC? Is it just that it is
>> cost-free, or is it easier to use mingw than MSVC for someone used to
>> building on Linux? (mingw certainly does not seem to have the
>> advantage of being fast!)
>
> See Craig's email today about problems with SDKs and availabilty of

Not sure what happened there.

... availability of free 64 bit MSVC compilers.

Also, some third party libraries are built with the Mingw compilers and
it's often best not to switch if you can help it.

>
>>
>> Would you like to put this somewhere on wiki.postgresql.org, or would
>> you mind if I do so?

Please go for it.

cheers

andrew


From: Craig Ringer <craig(at)2ndQuadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-25 00:55:17
Message-ID: 5101D7F5.1050304@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/25/2013 02:44 AM, Jeff Janes wrote:
>
> What advantages does mingw have over MSVC? Is it just that it is
> cost-free, or is it easier to use mingw than MSVC for someone used to
> building on Linux? (mingw certainly does not seem to have the
> advantage of being fast!)
As far as I'm concerned, the only advantages of MinGW are:

- Mostly compatible with the autotools/gmake toolchain via msys; and
- Not subject to Microsoft's whims regarding zero-cost access to toolchains

I don't particularly like autotools and think that using it on Windows
is pretty ugly. We need msys anyway because our build process uses bison
and flex and at the present time there are quality native Windows ports
of these tools, but there's no fundamental reason they couldn't run
without msys.

The greater appeal is that MinGW means that we're not totally dependent
on Microsoft's whims when it comes to free access to the toolchain. They
have a spotty history there.

There was no free access to Visual Studio compilers until 2004, with the
release of vctoolkit2003 (which has now vanished). They then added
compilers to the Windows SDK starting in (IIRC) 6.0a, through to 7.1
(Windows 7 and .NET 4.5). Those compilers have been removed from the SDK
as of the Windows 8 SDK. The other way to get Microsoft compilers is
from Visual Studio. This usually expensive product was released as a
free-to-use Express version with VC Express 2005. It was a dog of a
thing that required manual registry hacking and/or environment setup and
the manual addition of an SDK to make work. VC Express 2008 provided an
integrated installer, but also removed some debugging facilities. VC
Express 2010 removed more debugging features. Then with VC 2012
Microsoft announced that there would be no 2012 Express edition and that
everybody should convert their apps to the new Metro system for which
free a toolchain would be available. They went back on that under
community pressure and released VC Express 2012, which has many of the
debugging features restored and works well - but who knows if it'll see
any further updates.

All the Visual Studio Express editions require activation to use them,
though this activation is free. Once you have an activation code it's
valid forever, but there's no guarantee the servers will remain
available to generate new codes, so you'd better keep them written down.

Microsoft don't maintain old SDKs and VC Express editions much, if at
all, so installing and using older SDKs gets more complicated over time.
Take VC Express 2010, until recently the newest edition available:

- Uninstall any Visual Studio 2010 redists that're installed, taking
note of the versions
- Install VC Express 2010
- Install VC Express 2010 SP1
- Install VC Express 2010 SP1 Compiler Pack
- Download and reinstall any newer redists that you uninstalled, so your
apps keep working.

.... and it looks like having VC Express 2010 installed might partly
break VC 2012; I'm still unsure of exactly what caused that.

We can not fix any of these problems, nor can we update the toolsets to
run on newer Windows versions. So if we support only Visual Studio,
we're extremely dependent on the uncertain future of the free Microsoft
toolchain.

For that reason, even though it's not a great build environment and on
Windows not a great compiler, MinGW support is important.

If you notice issues with the MinGW builds, please report them so
regression test coverage can be improved. Hardly anyone uses them but
like the fire department, one day if you stop supporting them you really
regret it.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Craig Ringer <craig(at)2ndQuadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-25 01:44:28
Message-ID: 5101E37C.20901@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/25/2013 04:08 AM, Andrew Dunstan wrote:
>
> On 01/24/2013 02:41 PM, Andrew Dunstan wrote:
>>
>>>
>>> What advantages does mingw have over MSVC? Is it just that it is
>>> cost-free, or is it easier to use mingw than MSVC for someone used to
>>> building on Linux? (mingw certainly does not seem to have the
>>> advantage of being fast!)
>>
>> See Craig's email today about problems with SDKs and availabilty of
>> free 64 bit MSVC compilers.
>
> Also, some third party libraries are built with the Mingw compilers
> and it's often best not to switch if you can help it.
That's a trade-off; they're often not available in 64-bit, so you
usually land up needing to rebuild them for 64-bit anyway, and the mingw
64 bit toolchain seems to be a bit broken.

As for 64-bit MS compilers - at the moment the current MSVC 64-bit
compilers are available for free as cross-compilers. The rather bizarre
situation is that you have a 64-bit host OS running 32-bit compilers
under Wow64, producing 64-bit binaries as output. This seems to be quite
transparent to the build process when on a 64-bit host, as the host can
execute the cross-compiled binaries directly, so it isn't like a normal
cross-compile where you have to differentiate between host- and target-
executables.

So, while no native 64-bit compilers are available for free as part of
Visual Studio Express 2012 (11), it doesn't matter much. The issue is
more that it's very much Microsoft's whim whether they release compilers
at all and if so, which ones, when and how.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: james <james(at)mansionfamily(dot)plus(dot)com>
To: Craig Ringer <craig(at)2ndQuadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-26 16:04:35
Message-ID: 5103FE93.2020306@mansionfamily.plus.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> So, while no native 64-bit compilers are available for free as part of
> Visual Studio Express 2012 (11), it doesn't matter much. The issue is
> more that it's very much Microsoft's whim whether they release compilers
> at all and if so, which ones, when and how.

I think I have a pretty vanilla Visual Studio Express 2012 for Desktop and:

C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>.\cl
Microsoft (R) C/C++ Optimizing Compiler Version 17.00.51106.1 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>

Am I misunderstanding the discussion here?

Isn't that the 64-bit tool suite?

Does anyone care if the compiler is a 64 bit binary itself?


From: Craig Ringer <craig(at)2ndQuadrant(dot)com>
To: james(at)mansionfamily(dot)plus(dot)com
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-26 23:38:11
Message-ID: 510468E3.9000709@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/27/2013 12:04 AM, james wrote:
>> So, while no native 64-bit compilers are available for free as part of
>> Visual Studio Express 2012 (11), it doesn't matter much. The issue is
>> more that it's very much Microsoft's whim whether they release compilers
>> at all and if so, which ones, when and how.
>
> I think I have a pretty vanilla Visual Studio Express 2012 for Desktop
> and:
>
> C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>.\cl
> Microsoft (R) C/C++ Optimizing Compiler Version 17.00.51106.1 for x64
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> usage: cl [ option... ] filename... [ /link linkoption... ]
>
> C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>
>
> Am I misunderstanding the discussion here?
Yep, that's a cross-compiler, hence "x86_amd64". As I said, it doesn't
actually matter very much on an x64 host as the target arch binaries run
fine on the host arch. The only reason you'd need the native 64-bit
compiler is if you're compiling huge programs that cause the 32-bit to
64-bit cross-compiler to run out of memory. Not an issue for Pg.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
To: "'Boszormenyi Zoltan'" <zb(at)cybertec(dot)at>
Cc: "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-28 14:20:33
Message-ID: 00fd01cdfd62$a2e150b0$e8a3f210$@kommi@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Saturday, January 19, 2013 11:23 AM Amit kapila wrote:
>On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
>> Hi,
>>
>> Unfortunately, I won't have time to do anything with my lock_timeout
patch
>> for about 3 weeks. Does anyone have a little spare time to test it on
Windows?
>
>I shall try to do it, probably next week.
>Others are also welcome to test the patch.

I have carried out some windows testing of the lock timeout patch.

The extra tests which are carried out on the patch are attached in the mail.

Some observations on the patch:

1. Patch needs a rebase as it causing some rejections.
2. regress check failed because the expected ".out" file is not updated
properly.

Regards,
Hari babu.

Attachment Content-Type Size
lock_timeout_test.sql application/octet-stream 2.3 KB

From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
Cc: 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-30 14:29:04
Message-ID: 51092E30.9040405@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-28 15:20 keltezéssel, Hari Babu írta:
> On Saturday, January 19, 2013 11:23 AM Amit kapila wrote:
>> On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
>>> Hi,
>>>
>>> Unfortunately, I won't have time to do anything with my lock_timeout
> patch
>>> for about 3 weeks. Does anyone have a little spare time to test it on
> Windows?
>> I shall try to do it, probably next week.
>> Others are also welcome to test the patch.
> I have carried out some windows testing of the lock timeout patch.

Thanks very much. It means it didn't crash for you and
it waited the expected amount of time as well.

> The extra tests which are carried out on the patch are attached in the mail.

The patch itself contained regression tests to run by itself
and compete with statement_timeout as well, although
it waits only 2 seconds instead of 60 as in your test.

> Some observations on the patch:
>
> 1. Patch needs a rebase as it causing some rejections.

On a fresh GIT pull, it only caused a reject for the documentation
parts of the patch. No other rejects and no fuzz, only line shift
warnings were indicated by the patch. Anyway, a refreshed one
is attached.

> 2. regress check failed because the expected ".out" file is not updated
> properly.

Which regress check failed? The .out file was updated in the patch
for prepared_xacts.sql where the regression tests for lock_timeout
were added. Or do you mean the one for the sql file you sent?

Thanks,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
2-lock_timeout-v28.patch.gz application/x-gzip 11.8 KB

From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
Cc: 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-30 14:45:42
Message-ID: 51093216.3030405@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-30 15:29 keltezéssel, Zoltán Böszörményi írta:
> 2013-01-28 15:20 keltezéssel, Hari Babu írta:
>> On Saturday, January 19, 2013 11:23 AM Amit kapila wrote:
>>> On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
>>>> Hi,
>>>>
>>>> Unfortunately, I won't have time to do anything with my lock_timeout
>> patch
>>>> for about 3 weeks. Does anyone have a little spare time to test it on
>> Windows?
>>> I shall try to do it, probably next week.
>>> Others are also welcome to test the patch.
>> I have carried out some windows testing of the lock timeout patch.
>
> Thanks very much. It means it didn't crash for you and
> it waited the expected amount of time as well.
>
>> The extra tests which are carried out on the patch are attached in
>> the mail.
>
> The patch itself contained regression tests to run by itself
> and compete with statement_timeout as well, although
> it waits only 2 seconds instead of 60 as in your test.
>
>> Some observations on the patch:
>>
>> 1. Patch needs a rebase as it causing some rejections.
>
> On a fresh GIT pull, it only caused a reject for the documentation
> parts of the patch. No other rejects and no fuzz, only line shift
> warnings were indicated by the patch. Anyway, a refreshed one
> is attached.
>
>> 2. regress check failed because the expected ".out" file is not updated
>> properly.
>
> Which regress check failed? The .out file was updated in the patch
> for prepared_xacts.sql where the regression tests for lock_timeout
> were added. Or do you mean the one for the sql file you sent?

If the failed regression test is indeed the prepared_xacts.sql that is
in my patch, can you attach the regression.diff file after the failed
"make check"? Thanks.

>
> Thanks,
> Zoltán Böszörményi
>
>
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
To: 'Zoltán Böszörményi' <zb(at)cybertec(dot)at>
Cc: "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-30 15:06:36
Message-ID: 001401cdfefb$6714b870$353e2950$@kommi@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday, January 30, 2013 7:59 PM Zoltán Böszörményi wrote:
>>2013-01-28 15:20 keltezéssel, Hari Babu írta:
>> 2. regress check failed because the expected ".out" file is not
>> updated properly.
>
>Which regress check failed? The .out file was updated in the patch for
prepared_xacts.sql where >the regression tests for lock_timeout were added.
Or do you mean the one for the sql file you >sent?

During regress test, "prepared_xacts_1.out" expected file used for comparing
with the result file. Which is not updated by the patch. Because of this
reason the regress check is failing.

Regards,
Hari babu.


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
Cc: 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-30 16:45:54
Message-ID: 51094E42.3060100@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-30 16:06 keltezéssel, Hari Babu írta:
> On Wednesday, January 30, 2013 7:59 PM Zoltán Böszörményi wrote:
>>> 2013-01-28 15:20 keltezéssel, Hari Babu írta:
>>> 2. regress check failed because the expected ".out" file is not
>>> updated properly.
>> Which regress check failed? The .out file was updated in the patch for
> prepared_xacts.sql where >the regression tests for lock_timeout were added.
> Or do you mean the one for the sql file you >sent?
>
> During regress test, "prepared_xacts_1.out" expected file used for comparing
> with the result file. Which is not updated by the patch. Because of this
> reason the regress check is failing.

I see, so this is a Windows-only change that needs a different.
Can you send the resulting prepared_xacts_1.out file so I can integrate
its changes into my patch? That way it would be complete.
Thanks in advance.

>
> Regards,
> Hari babu.
>
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>
Cc: 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 13:55:14
Message-ID: 510A77C2.1020706@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-30 17:45 keltezéssel, Zoltán Böszörményi írta:
> 2013-01-30 16:06 keltezéssel, Hari Babu írta:
>> On Wednesday, January 30, 2013 7:59 PM Zoltán Böszörményi wrote:
>>>> 2013-01-28 15:20 keltezéssel, Hari Babu írta:
>>>> 2. regress check failed because the expected ".out" file is not
>>>> updated properly.
>>> Which regress check failed? The .out file was updated in the patch for
>> prepared_xacts.sql where >the regression tests for lock_timeout were
>> added.
>> Or do you mean the one for the sql file you >sent?
>>
>> During regress test, "prepared_xacts_1.out" expected file used for
>> comparing
>> with the result file. Which is not updated by the patch. Because of this
>> reason the regress check is failing.
>
> I see, so this is a Windows-only change that needs a different.
> Can you send the resulting prepared_xacts_1.out file so I can integrate
> its changes into my patch? That way it would be complete.
> Thanks in advance.

I have found a little time to look into this problem and
found a way to make pg_regress use prepared_xacts_1.out.
I had to change line 2193 in pg_regress.c from

fputs("max_prepared_transactions = 2\n", pg_conf);

to

fputs("max_prepared_transactions = 0\n", pg_conf);

The patch now passed "make check" in both cases.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
2-lock_timeout-v29.patch.gz application/x-gzip 11.9 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Zoltán Böszörményi <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 14:22:46
Message-ID: 20130131142246.GA4883@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zoltán Böszörményi wrote:

> I have found a little time to look into this problem and
> found a way to make pg_regress use prepared_xacts_1.out.
> I had to change line 2193 in pg_regress.c from
>
> fputs("max_prepared_transactions = 2\n", pg_conf);
>
> to
>
> fputs("max_prepared_transactions = 0\n", pg_conf);
>
> The patch now passed "make check" in both cases.

That sounds a lot more difficult than just using "make installcheck" and
configure the running server with zero prepared xacts ...

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 14:55:46
Message-ID: 510A85F2.7090806@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
> Zoltán Böszörményi wrote:
>
>> I have found a little time to look into this problem and
>> found a way to make pg_regress use prepared_xacts_1.out.
>> I had to change line 2193 in pg_regress.c from
>>
>> fputs("max_prepared_transactions = 2\n", pg_conf);
>>
>> to
>>
>> fputs("max_prepared_transactions = 0\n", pg_conf);
>>
>> The patch now passed "make check" in both cases.
> That sounds a lot more difficult than just using "make installcheck" and
> configure the running server with zero prepared xacts ...

It didn't occur to me to use "make installcheck" for this one.

What is strange though is why prepared_xacts_1.out exists
at all, since pg_regress.c / make check seems to set
max_prepared_transactions on Windows, too.

Is there a way to specify such settings in REGRESS_OPTS?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Zoltán Böszörményi <zb(at)cybertec(dot)at>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 15:38:04
Message-ID: 510A8FDC.2050700@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/31/2013 09:55 AM, Zoltán Böszörményi wrote:
> 2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
>> Zoltán Böszörményi wrote:
>>
>>> I have found a little time to look into this problem and
>>> found a way to make pg_regress use prepared_xacts_1.out.
>>> I had to change line 2193 in pg_regress.c from
>>>
>>> fputs("max_prepared_transactions = 2\n", pg_conf);
>>>
>>> to
>>>
>>> fputs("max_prepared_transactions = 0\n", pg_conf);
>>>
>>> The patch now passed "make check" in both cases.
>> That sounds a lot more difficult than just using "make installcheck" and
>> configure the running server with zero prepared xacts ...
>
> It didn't occur to me to use "make installcheck" for this one.
>
> What is strange though is why prepared_xacts_1.out exists
> at all, since pg_regress.c / make check seems to set
> max_prepared_transactions on Windows, too.
>
> Is there a way to specify such settings in REGRESS_OPTS?
>
>

You can use the temp_config option to specify a file with non-standard
settings that will be appended to the installed postgresql.conf.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zoltán Böszörményi <zb(at)cybertec(dot)at>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 15:39:38
Message-ID: 5477.1359646778@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?ISO-8859-1?Q?Zolt=E1n_B=F6sz=F6rm=E9nyi?= <zb(at)cybertec(dot)at> writes:
> 2013-01-31 15:22 keltezssel, Alvaro Herrera rta:
>> That sounds a lot more difficult than just using "make installcheck" and
>> configure the running server with zero prepared xacts ...

> It didn't occur to me to use "make installcheck" for this one.

> What is strange though is why prepared_xacts_1.out exists
> at all, since pg_regress.c / make check seems to set
> max_prepared_transactions on Windows, too.

Alvaro told you why: so that the tests wouldn't report failure in
"make installcheck" against a stock-configuration server.

BTW, 99% of the time you can update alternative expected files by
applying the same patch to them as you did to the tested version.
At least that usually works for me, and it can be a lot easier than
arranging to duplicate the environment the alternative file is
meant for.

regards, tom lane


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 17:27:22
Message-ID: 510AA97A.5030909@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-31 16:39 keltezéssel, Tom Lane írta:
> =?ISO-8859-1?Q?Zolt=E1n_B=F6sz=F6rm=E9nyi?= <zb(at)cybertec(dot)at> writes:
>> 2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
>>> That sounds a lot more difficult than just using "make installcheck" and
>>> configure the running server with zero prepared xacts ...
>> It didn't occur to me to use "make installcheck" for this one.
>> What is strange though is why prepared_xacts_1.out exists
>> at all, since pg_regress.c / make check seems to set
>> max_prepared_transactions on Windows, too.
> Alvaro told you why: so that the tests wouldn't report failure in
> "make installcheck" against a stock-configuration server.
>
> BTW, 99% of the time you can update alternative expected files by
> applying the same patch to them as you did to the tested version.
> At least that usually works for me, and it can be a lot easier than
> arranging to duplicate the environment the alternative file is
> meant for.
>
> regards, tom lane

Thanks. A question though: how does "make check" or "make installcheck"
chooses between the *.out and its different *_N.out incarnations?
I couldn't find traces of prepared_xacts_1.out in any file saying "this
is the one to be used in this-and-this" configuration. Does the procedure
check against all versions and the least different one is reported?

Thanks,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zoltán Böszörményi <zb(at)cybertec(dot)at>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 18:38:07
Message-ID: 16962.1359657487@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?ISO-8859-2?Q?Zolt=E1n_B=F6sz=F6rm=E9nyi?= <zb(at)cybertec(dot)at> writes:
> Thanks. A question though: how does "make check" or "make installcheck"
> chooses between the *.out and its different *_N.out incarnations?
> I couldn't find traces of prepared_xacts_1.out in any file saying "this
> is the one to be used in this-and-this" configuration. Does the procedure
> check against all versions and the least different one is reported?

Exactly. This is documented, see the "regression tests" chapter of the
SGML manual.

regards, tom lane


From: Zoltán Böszörményi <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-01-31 21:12:01
Message-ID: 510ADE21.3060709@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-31 19:38 keltezéssel, Tom Lane írta:
> =?ISO-8859-2?Q?Zolt=E1n_B=F6sz=F6rm=E9nyi?= <zb(at)cybertec(dot)at> writes:
>> Thanks. A question though: how does "make check" or "make installcheck"
>> chooses between the *.out and its different *_N.out incarnations?
>> I couldn't find traces of prepared_xacts_1.out in any file saying "this
>> is the one to be used in this-and-this" configuration. Does the procedure
>> check against all versions and the least different one is reported?
> Exactly. This is documented, see the "regression tests" chapter of the
> SGML manual.
>
> regards, tom lane

Thanks.

I tested my patch with installcheck and installcheck-parallel
using max_prepared_transactions=0 in the server and
it passed that way too.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: 'Craig Ringer' <craig(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:32:48
Message-ID: 511A8AD0.90501@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-01-31 22:12 keltezéssel, Zoltán Böszörményi írta:
> 2013-01-31 19:38 keltezéssel, Tom Lane írta:
>> =?ISO-8859-2?Q?Zolt=E1n_B=F6sz=F6rm=E9nyi?= <zb(at)cybertec(dot)at> writes:
>>> Thanks. A question though: how does "make check" or "make installcheck"
>>> chooses between the *.out and its different *_N.out incarnations?
>>> I couldn't find traces of prepared_xacts_1.out in any file saying "this
>>> is the one to be used in this-and-this" configuration. Does the procedure
>>> check against all versions and the least different one is reported?
>> Exactly. This is documented, see the "regression tests" chapter of the
>> SGML manual.
>>
>> regards, tom lane
>
> Thanks.
>
> I tested my patch with installcheck and installcheck-parallel
> using max_prepared_transactions=0 in the server and
> it passed that way too.

# ping -c 3 craig.ringer
PING craig.ringer (192.168.1.80) 56(84) bytes of data.
From craig.ringer (192.168.1.80) icmp_seq=1 Destination Host Unreachable
From craig.ringer (192.168.1.80) icmp_seq=2 Destination Host Unreachable
From craig.ringer (192.168.1.80) icmp_seq=3 Destination Host Unreachable

--- craig.ringer ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms

All details were fixed:
- rebased to recent GIT
- updated regression test for max_prepared_transactions=0
- tested on Windows (by Hari Babu, thanks!)

Why is this patch still in "Waiting on Author" state in the CF app?

Thanks in advance,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:35:17
Message-ID: 25512.1360694117@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> Why is this patch still in "Waiting on Author" state in the CF app?

If you're the author, you should change it back to "Needs Review" when
you've responded to the review.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:41:42
Message-ID: 511A8CE6.4080701@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-12 19:35 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> Why is this patch still in "Waiting on Author" state in the CF app?
> If you're the author, you should change it back to "Needs Review" when
> you've responded to the review.
>
> regards, tom lane

Maybe it's just me, but I can't see any facility (link or button) to do that.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:45:50
Message-ID: 25736.1360694750@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> 2013-02-12 19:35 keltezssel, Tom Lane rta:
>> If you're the author, you should change it back to "Needs Review" when
>> you've responded to the review.

> Maybe it's just me, but I can't see any facility (link or button) to do that.

Go to the patch page, click "edit patch", adjust the popdown list for
"patch status" ...

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:51:45
Message-ID: 511A8F41.1030008@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-12 19:45 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> 2013-02-12 19:35 keltezéssel, Tom Lane írta:
>>> If you're the author, you should change it back to "Needs Review" when
>>> you've responded to the review.
>> Maybe it's just me, but I can't see any facility (link or button) to do that.
> Go to the patch page, click "edit patch", adjust the popdown list for
> "patch status" ...
>
> regards, tom lane

This is what I am saying, I am logged in and there is no popdown list
for the "Patch Status", only text...

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:55:37
Message-ID: 20130212185537.GB14018@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-02-12 19:51:45 +0100, Boszormenyi Zoltan wrote:
> 2013-02-12 19:45 keltezéssel, Tom Lane írta:
> >Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> >>2013-02-12 19:35 keltezéssel, Tom Lane írta:
> >>>If you're the author, you should change it back to "Needs Review" when
> >>>you've responded to the review.
> >>Maybe it's just me, but I can't see any facility (link or button) to do that.
> >Go to the patch page, click "edit patch", adjust the popdown list for
> >"patch status" ...

> This is what I am saying, I am logged in and there is no popdown list
> for the "Patch Status", only text...

Thats strange. I changed it for now just to see if there's any
problem, worked fine for me.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:56:23
Message-ID: 25993.1360695383@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> 2013-02-12 19:45 keltezssel, Tom Lane rta:
>> Go to the patch page, click "edit patch", adjust the popdown list for
>> "patch status" ...

> This is what I am saying, I am logged in and there is no popdown list
> for the "Patch Status", only text...

That sure sounds like the
https://commitfest.postgresql.org/action/patch_view?id=896
page, not the edit-patch page which is
https://commitfest.postgresql.org/action/patch_form?id=896

There should be popdown lists for both topic and status ...

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 18:58:12
Message-ID: 511A90C4.7080001@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/12/2013 01:51 PM, Boszormenyi Zoltan wrote:
> 2013-02-12 19:45 keltezéssel, Tom Lane írta:
>> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>>> 2013-02-12 19:35 keltezéssel, Tom Lane írta:
>>>> If you're the author, you should change it back to "Needs Review" when
>>>> you've responded to the review.
>>> Maybe it's just me, but I can't see any facility (link or button) to
>>> do that.
>> Go to the patch page, click "edit patch", adjust the popdown list for
>> "patch status" ...
>>
>> regards, tom lane
>
> This is what I am saying, I am logged in and there is no popdown list
> for the "Patch Status", only text...
>
>

I bet you didn't go to "edit patch" like Tom said to.

cheers

andrew


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'Craig Ringer' <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-12 19:00:13
Message-ID: 511A913D.8080404@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-12 19:58 keltezéssel, Andrew Dunstan írta:
>
> On 02/12/2013 01:51 PM, Boszormenyi Zoltan wrote:
>> 2013-02-12 19:45 keltezéssel, Tom Lane írta:
>>> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>>>> 2013-02-12 19:35 keltezéssel, Tom Lane írta:
>>>>> If you're the author, you should change it back to "Needs Review" when
>>>>> you've responded to the review.
>>>> Maybe it's just me, but I can't see any facility (link or button) to do that.
>>> Go to the patch page, click "edit patch", adjust the popdown list for
>>> "patch status" ...
>>>
>>> regards, tom lane
>>
>> This is what I am saying, I am logged in and there is no popdown list
>> for the "Patch Status", only text...
>>
>>
>
> I bet you didn't go to "edit patch" like Tom said to.

Yes, that was the problem. I missed that link. Thanks.
Where can I get brown paper bags in bulk? :-D

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Zoltán Böszörményi <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-23 01:55:23
Message-ID: 20130223015523.GV16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zoltán,

* Zoltán Böszörményi (zb(at)cybertec(dot)at) wrote:
> The patch now passed "make check" in both cases.

Is v29 the latest version of this patch..?

Looking through the patch, I've noticed a couple of things:

First off, it's not in context diff format, which is the PG standard for
patch submission. Next, the documentation has a few issues:

- "Heavy-weight" should really be defined in terms of specific lock
types or modes. We don't use the 'heavyweight' term anywhere else in
the documentation that I've found.

- I'd reword this:

Abort any statement that tries to acquire a heavy-weight lock on rows,
pages, tables, indices or other objects and the lock(s) has to wait
more than the specified number of milliseconds.

as:

Abort any statement which waits longer than the specified number of
milliseconds while attempting to acquire a lock on ...

- I don't particularly like "lock_timeout_option", for a couple of
reasons. First is simply the name is terrible, but beyond that, it
strikes me that wanting to set both a 'per-lock timeout' and a
'overall waiting-for-locks timeout' at the same time would be a
reasonable use-case. If we're going to have 2 GUCs and we're going to
support each of those options, why not just let the user specify
values for each?

- This is a bit disingenuous:

If <literal>NOWAIT</> option is not specified and
<varname>lock_timeout</varname> is set and the lock or statement
(depending on <varname>lock_timeout_option</varname>) needs to wait
more than the specified value in milliseconds, the command reports
an error after timing out, rather than waiting indefinitely.

The SELECT would simply continue to wait until the lock is available.
That's a bit more specific than 'indefinitely'. Also, we might add a
sentence about statement_timeout as well, if we're going to document
what can happen if you don't use NOWAIT with your SELECT-FOR-UPDATE.
Should we add documentation to the other commands that wait for locks?

- Looks like this was ended mid-thought...:

+ * Lock a semaphore (decrement count), blocking if count would be < 0
+ * until a timeout triggers. Returns true if

- Not a big fan of this:

+ * See notes in PGSemaphoreLock.

- I'm not thrilled with the new API for defining the timeouts.
Particularly, I believe the more common convention for passing around
arrays of structures is to have an empty array at the end, which
avoids having to remember to update the # of entries every time it
gets changed. Of course, another option would be to use our existing
linked list implementation and its helper macros such as our
foreach() construct.

- As I've mentioned in other places/times, comments should be about why
we're doing something, not what we're doing- the code tells you that.
As such, comments like this really aren't great:
/* Assert request is sane */
/* Now re-enable the timer, if necessary. */

- Do we really need TimestampTzPlusMicroseconds..?

In general, I like this feature and a number of things above are pretty
small issues. The main questions, imv, are if we really need both
'options', and, if so, how they should work, and the API for defining
timers.

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-23 18:15:06
Message-ID: 5129072A.6000401@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-23 02:55 keltezéssel, Stephen Frost írta:
> Zoltán,
>
> * Zoltán Böszörményi (zb(at)cybertec(dot)at) wrote:
>> The patch now passed "make check" in both cases.
> Is v29 the latest version of this patch..?

Yes, is was until you came up with your review... ;-)

> Looking through the patch, I've noticed a couple of things:
>
> First off, it's not in context diff format, which is the PG standard for
> patch submission.

Since moving to GIT, this expectation is obsolete. All PG hackers
became comfortable with the unified diff format, see references
from the list. A lot of hackers post "git diff" patches which cannot
produce context diff, either.

> Next, the documentation has a few issues:
>
> - "Heavy-weight" should really be defined in terms of specific lock
> types or modes. We don't use the 'heavyweight' term anywhere else in
> the documentation that I've found.

That's not strictly true but it's not widely used either:

$ find . -type f | xargs grep weight | grep heavy
./monitoring.sgml: <entry>Probe that fires when a request for a heavyweight lock (lmgr
lock)
./monitoring.sgml: <entry>Probe that fires when a request for a heavyweight lock (lmgr
lock)

>
> - I'd reword this:
>
> Abort any statement that tries to acquire a heavy-weight lock on rows,
> pages, tables, indices or other objects and the lock(s) has to wait
> more than the specified number of milliseconds.
>
> as:
>
> Abort any statement which waits longer than the specified number of
> milliseconds while attempting to acquire a lock on ...

OK.

>
> - I don't particularly like "lock_timeout_option", for a couple of
> reasons. First is simply the name is terrible, but beyond that, it
> strikes me that wanting to set both a 'per-lock timeout' and a
> 'overall waiting-for-locks timeout' at the same time would be a
> reasonable use-case. If we're going to have 2 GUCs and we're going to
> support each of those options, why not just let the user specify
> values for each?

OK, suggest names for the 2 GUCS, please.

> - This is a bit disingenuous:
>
> If <literal>NOWAIT</> option is not specified and
> <varname>lock_timeout</varname> is set and the lock or statement
> (depending on <varname>lock_timeout_option</varname>) needs to wait
> more than the specified value in milliseconds, the command reports
> an error after timing out, rather than waiting indefinitely.
>
> The SELECT would simply continue to wait until the lock is available.
> That's a bit more specific than 'indefinitely'. Also, we might add a
> sentence about statement_timeout as well, if we're going to document
> what can happen if you don't use NOWAIT with your SELECT-FOR-UPDATE.
> Should we add documentation to the other commands that wait for locks?
>
> - Looks like this was ended mid-thought...:
>
> + * Lock a semaphore (decrement count), blocking if count would be < 0
> + * until a timeout triggers. Returns true if

Right.

>
> - Not a big fan of this:
>
> + * See notes in PGSemaphoreLock.

Why? Copy&paste the *long* comment (a different one in each *_sema.c)
or omitting the comments is better? Suggest a better comment, and
it will be included.

> - I'm not thrilled with the new API for defining the timeouts.
> Particularly, I believe the more common convention for passing around
> arrays of structures is to have an empty array at the end, which
> avoids having to remember to update the # of entries every time it
> gets changed. Of course, another option would be to use our existing
> linked list implementation and its helper macros such as our
> foreach() construct.

A linked list might be better, actually I like it.

> - As I've mentioned in other places/times, comments should be about why
> we're doing something, not what we're doing- the code tells you that.
> As such, comments like this really aren't great:
> /* Assert request is sane */
> /* Now re-enable the timer, if necessary. */
>
> - Do we really need TimestampTzPlusMicroseconds..?

Well, my code is the only user for this macro but it's cleaner
than explicitly doing

#ifdef HAVE_INT64_TIMESTAMP
fin_time = timeout_start + delay_ms * (int64)1000;
#else
fin_time = timeout_start + delay_ms / 1000000.0;
#endif

>
> In general, I like this feature and a number of things above are pretty
> small issues. The main questions, imv, are if we really need both
> 'options', and, if so, how they should work, and the API for defining
> timers.
>
> Thanks,
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 02:23:29
Message-ID: 20130224022329.GZ16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zoltán,

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> 2013-02-23 02:55 keltezéssel, Stephen Frost írta:
> >First off, it's not in context diff format, which is the PG standard for
> >patch submission.
>
> Since moving to GIT, this expectation is obsolete.

No, it isn't. Patches posted to the list should be in context diff
format (and uncompressed unless it's too large) for easier reading.
That avoids having to download it, apply it to a git tree and then
create the diff ourselves. Indeed, the move to git had impact on this
at all.

> All PG hackers
> became comfortable with the unified diff format, see references
> from the list. A lot of hackers post "git diff" patches which cannot
> produce context diff, either.

It's quite possible to create a context diff from git- indeed, it's
documented on the http://wiki.postgresql.org/wiki/Working_with_Git
portion of the wiki, specifically to help people understand how to take
the changes they've made in their git forks and create context diffs to
post to the mailing lists. The preference for context diffs is also
still documented at http://wiki.postgresql.org/wiki/Submitting_a_Patch.

And, to be honest, I'm not sure how familiar you are with the history of
PG in this area, but I've been through all of it from the pre-git CVS
days, through the migration to git, to the exclusive use of git. I feel
quite confident that the preference for context diffs hasn't changed.

> That's not strictly true but it's not widely used either:
>
> $ find . -type f | xargs grep weight | grep heavy
> ./monitoring.sgml: <entry>Probe that fires when a request for a
> heavyweight lock (lmgr lock)
> ./monitoring.sgml: <entry>Probe that fires when a request for a
> heavyweight lock (lmgr lock)

Interesting. That didn't show up in the searches that I was doing
through the web interface, though it does now. If we're going to use
that term, we should define it in the lock documentation. If not, then
we should avoid it everywhere. I'm fine with either.

> >- I don't particularly like "lock_timeout_option", for a couple of
> > reasons. First is simply the name is terrible, but beyond that, it
> > strikes me that wanting to set both a 'per-lock timeout' and a
> > 'overall waiting-for-locks timeout' at the same time would be a
> > reasonable use-case. If we're going to have 2 GUCs and we're going to
> > support each of those options, why not just let the user specify
> > values for each?
>
> OK, suggest names for the 2 GUCS, please.

lock_timeout_wait and lock_timeout_stmt_wait ? Though I've been really
wondering to myself as to if we need both of these options as well as
statement_timeout. Perhaps you can explain the use case for each
option and how they're distinct from each other? Indeed, the use-case
that I'm envisioning is focused around "don't wait more than 'X' for the
relation-level locks" which would allow you to distinguish queries that
are, most likely anyway, making progress, from those which have been
caught behind a lock. That would match your 'per-statement' lock
timeout concept, iiuc, though I think it might be more simply
implemented.

> >- Not a big fan of this:
> >
> >+ * See notes in PGSemaphoreLock.
>
> Why? Copy&paste the *long* comment (a different one in each *_sema.c)
> or omitting the comments is better? Suggest a better comment, and
> it will be included.

How about, for starters, there's more than one PGSemaphoreLock? Second,
as you'll note in posix_sema.c, it's useful to say more than just "look
over there". I'm not suggesting a mass copy/paste, but more than 4
words would be valuable.

> >- Do we really need TimestampTzPlusMicroseconds..?
>
> Well, my code is the only user for this macro but it's cleaner
> than explicitly doing

To be honest, I was really wondering why TimestampTzPlusMilliseconds
isn't sufficient, not suggesting that we litter the code with #ifdef's.

Thanks,

Stephen


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 03:07:13
Message-ID: 512983E1.1080307@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/23/2013 01:15 PM, Boszormenyi Zoltan wrote:
>>
>> First off, it's not in context diff format, which is the PG standard for
>> patch submission.
>
>
> Since moving to GIT, this expectation is obsolete. All PG hackers
> became comfortable with the unified diff format, see references
> from the list. A lot of hackers post "git diff" patches which cannot
> produce context diff, either.

I am not aware that project policy has changed in the slightest in this
regard.

Every unified diff can be turned into a context diff by passing it
though "filterdiff --format=context".

cheers

andrew


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 03:15:08
Message-ID: 20130224031508.GB16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andrew Dunstan (andrew(at)dunslane(dot)net) wrote:
> Every unified diff can be turned into a context diff by passing it
> though "filterdiff --format=context".

On that point, annoyingly, it's not accurate for every diff. In
particular, when I tried it with the diff for 'v29' of this lock_timeout
patch, it only returns the context diff for the first two files:

zcat 2-lock_timeout-v29.patch.gz | grep -- '--- postgresql' | wc -l
22

zcat 2-lock_timeout-v29.patch.gz | \
filterdiff --format=context | \
grep '\*\*\* postgresql' | wc -l
2

In the end, I did create a local git branch, commit the patch, then
diff it back against master using my context-diff git helper to get
something easier to read through. Rather annoying.

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 08:27:01
Message-ID: 5129CED5.3050004@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-24 03:23 keltezéssel, Stephen Frost írta:
> Zoltán,
>
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> 2013-02-23 02:55 keltezéssel, Stephen Frost írta:
>>> First off, it's not in context diff format, which is the PG standard for
>>> patch submission.
>> Since moving to GIT, this expectation is obsolete.
> No, it isn't. Patches posted to the list should be in context diff
> format (and uncompressed unless it's too large) for easier reading.
> That avoids having to download it, apply it to a git tree and then
> create the diff ourselves. Indeed, the move to git had impact on this
> at all.

I remembered this mail from The Master(tm):
http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us

>
>> All PG hackers
>> became comfortable with the unified diff format, see references
>> from the list. A lot of hackers post "git diff" patches which cannot
>> produce context diff, either.
> It's quite possible to create a context diff from git- indeed, it's
> documented on the http://wiki.postgresql.org/wiki/Working_with_Git
> portion of the wiki, specifically to help people understand how to take
> the changes they've made in their git forks and create context diffs to
> post to the mailing lists. The preference for context diffs is also
> still documented at http://wiki.postgresql.org/wiki/Submitting_a_Patch.
>
> And, to be honest, I'm not sure how familiar you are with the history of
> PG in this area, but I've been through all of it from the pre-git CVS
> days, through the migration to git, to the exclusive use of git.

I started with Postgres95, though only using the source tarballs.
I started following CVS at around 2003 or so.

> I feel
> quite confident that the preference for context diffs hasn't changed.

With stats for the February, 2013:
Name ctx ctx/gitunif/git unif
Dimitry Fontaine 40 1 0
Ian Lawrence Barwick 0 1 0 0
Alvaro Herrera 10 0 5 0
Pavel Stehule 9 0 0 0
Heikki Linnakangas 0 0 7 0
Michael Paquier 0 0 6 0
Andres Freund 0 0 5 0
Alexander Law 0 0 1 0
Etsuro Fujita 0 0 4 0
Amit Kapila 4 0 0 0
Kevin Grittner 2 0 3 0
Gurjeet Singh 0 0 2 0
Erik Rijkers 0 0 0 1
Zoltan Boszormenyi 0 0 2 0
Tomas Vondra 0 0 2 0
Bruce Momjian 0 4 0 0
Fujii Masao 1 0 0 0
David Fetter 5 0 0 0
Jonathan Rogers 0 0 1 0
Andrew Dunstan 2 0 0 0
Manlio Perillo 0 0 1 0
Dean Rasheed 0 1 2 0
Jeff Janes 0 2 1 0
Phil Sorber 0 3 2 0
Mark Wong 0 1 0 0
Ivan Lezhnjov IV 0 0 1 0
Kohei Gagai 0 0 2 0
MauMau 1 0 0 0
Tom Lane 0 1 0 0
Sean Chittenden 0 0 2 0
Albe Laurenz 0 1 0 0
Daniel Farina 1 0 0 0
Simon Riggs 0 0 3 0
Peter Eisentraut 0 0 4 0

Plain context diffs: 39
Context diffs generated from git: 14
"git diff" patches: 57
Plain unified diff: 1

Some mails contained more than one patch, these were
counted as-is. One patch is one patch.

So, more than halfof the recently posted patches come
directly from "git diff". The preference has changed.
But indeed, plain "diff -u" is cleanly in the minority.
I can repost my patch from "git diff", too, to be in
the majority. :-P

>> That's not strictly true but it's not widely used either:
>>
>> $ find . -type f | xargs grep weight | grep heavy
>> ./monitoring.sgml: <entry>Probe that fires when a request for a
>> heavyweight lock (lmgr lock)
>> ./monitoring.sgml: <entry>Probe that fires when a request for a
>> heavyweight lock (lmgr lock)
> Interesting. That didn't show up in the searches that I was doing
> through the web interface, though it does now. If we're going to use
> that term, we should define it in the lock documentation. If not, then
> we should avoid it everywhere. I'm fine with either.

Me too, the documentation should be consistent. I will remove the
"heavyweight" term.

>
>>> - I don't particularly like "lock_timeout_option", for a couple of
>>> reasons. First is simply the name is terrible, but beyond that, it
>>> strikes me that wanting to set both a 'per-lock timeout' and a
>>> 'overall waiting-for-locks timeout' at the same time would be a
>>> reasonable use-case. If we're going to have 2 GUCs and we're going to
>>> support each of those options, why not just let the user specify
>>> values for each?
>> OK, suggest names for the 2 GUCS, please.
> lock_timeout_wait and lock_timeout_stmt_wait ?

Hm. How about without the "_wait" suffix?
Or lock_timeout vs statement_lock_timeout?

> Though I've been really
> wondering to myself as to if we need both of these options as well as
> statement_timeout.

> Perhaps you can explain the use case for each
> option and how they're distinct from each other?

statement_timeout is the legacy behaviour, it should be kept.
It's behaviour is well understood: the statement finishes under
the specified time or it throws an error. The problem from the
application point of view is that the error can happen while
the tuples are being transferred to the client, so the recordset
can be cut in half.

"lock_timeout_stmt" (or lock_timeout_option = per_statement)
is somewhat extending the statement_timeout as only the
time waiting on locks are counted, so SELECT FOR UPDATE/etc.
may throw an error but if all rows are collected already, or
plain SELECT is run, the application gets them all.
This seems to follow the Microsoft SQL Server semantics:
http://msdn.microsoft.com/en-us/library/ms189470.aspx

The per-lock lock_timeout was the result of a big Informix
project ported to PostgreSQL, this follows Informix semantics:
http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm

> Indeed, the use-case
> that I'm envisioning is focused around "don't wait more than 'X' for the
> relation-level locks" which would allow you to distinguish queries that
> are, most likely anyway, making progress, from those which have been
> caught behind a lock. That would match your 'per-statement' lock
> timeout concept, iiuc, though I think it might be more simply
> implemented.
>
>>> - Not a big fan of this:
>>>
>>> + * See notes in PGSemaphoreLock.
>> Why? Copy&paste the *long* comment (a different one in each *_sema.c)
>> or omitting the comments is better? Suggest a better comment, and
>> it will be included.
> How about, for starters, there's more than one PGSemaphoreLock? Second,
> as you'll note in posix_sema.c, it's useful to say more than just "look
> over there". I'm not suggesting a mass copy/paste, but more than 4
> words would be valuable.

OK.

>
>>> - Do we really need TimestampTzPlusMicroseconds..?
>> Well, my code is the only user for this macro but it's cleaner
>> than explicitly doing
> To be honest, I was really wondering why TimestampTzPlusMilliseconds
> isn't sufficient, not suggesting that we litter the code with #ifdef's.

Because Timestamp[Tz] is microsecond resolution, and the timeout
GUCs are in milliseconds. Adding up time differences (and rounding
or truncating them to millisecond in the process) would make the
per-statement lock timeout lose precision...

>
> Thanks,
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-24 08:44:54
Message-ID: 5129D306.5010804@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24.02.2013 05:07, Andrew Dunstan wrote:
> On 02/23/2013 01:15 PM, Boszormenyi Zoltan wrote:
>>>
>>> First off, it's not in context diff format, which is the PG standard for
>>> patch submission.
>>
>> Since moving to GIT, this expectation is obsolete. All PG hackers
>> became comfortable with the unified diff format, see references
>> from the list. A lot of hackers post "git diff" patches which cannot
>> produce context diff, either.
>
> I am not aware that project policy has changed in the slightest in this
> regard.

I can't speak for others, but I personally don't care whether a patch is
posted in unified or context diff format. Not as a general rule, anyway;
patches that modify a few lines here and there are generally more
readable in unified format, as the old and new lines are lined up right
next to each other:

@@ -329,7 +346,7 @@ foreign_expr_walker(Node *node, foreign_expr_cxt
*context)
static bool
is_builtin(Oid oid)
{
- return (oid < FirstNormalObjectId);
+ return (oid < FirstBootstrapObjectId);
}

Then again, more complicated patches that modify large chunks of code
are often unreadable when the - and + lines are mixed up in the same
chunk; those are more readable in context format.

So if you want to be kind to readers, look at the patch and choose the
format depending on which one makes it look better. But there's no need
to make a point of it when someone posts in "wrong" format.

> Every unified diff can be turned into a context diff by passing it
> though "filterdiff --format=context".

Yep. And in emacs, there's "diff-unified->context" command. I bet most
editors have a similar functionality these days.

- Heikki


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 12:30:15
Message-ID: 512A07D7.3080709@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen,

2013-02-23 02:55 keltezéssel, Stephen Frost írta:
> Zoltán,
>
> * Zoltán Böszörményi (zb(at)cybertec(dot)at) wrote:
>> The patch now passed "make check" in both cases.
> Is v29 the latest version of this patch..?

attached is v30, I hope with everything fixed.
- List based enable/disable_multiple_timeouts()
- separate per-lock and per-statement lock_timeout variants
- modified comments and documentation

Patch is "git diff" format. Please, review.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
2-lock_timeout-v30.patch.gz application/x-tar 11.8 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 14:03:44
Message-ID: 20130224140344.GG16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> 2013-02-24 03:23 keltezéssel, Stephen Frost írta:
> >No, it isn't. Patches posted to the list should be in context diff
> >format (and uncompressed unless it's too large) for easier reading.
> >That avoids having to download it, apply it to a git tree and then
> >create the diff ourselves. Indeed, the move to git had impact on this
> >at all.
>
> I remembered this mail from The Master(tm):
> http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us

Which matches exactly what I was saying- context diff provides easier
readind for review purposes, which is presumably what you were looking
to have happen when you posted this patch to the mailing list. And it's
still the documented expectation and project policy, regardless of any
individual's email. If we're going to change it, then the
documentation, et al, should be updated as well.

For my 2c, I still prefer context diffs posted to the mailing lists,
but would also like to see more people posting pull requests. That
doesn't make it project policy though.

> So, more than halfof the recently posted patches come
> directly from "git diff". The preference has changed.

No, more people have ignored the project policy than not- that doesn't
say anything about the preference or what the policy is.

> >lock_timeout_wait and lock_timeout_stmt_wait ?
>
> Hm. How about without the "_wait" suffix?
> Or lock_timeout vs statement_lock_timeout?

I could live without the _wait suffix, but it occurs to me that none of
these really indicate that statement_lock_timeout is an accumulated
timeout. Perhaps it should say 'total lock wait timeout' or similar?

> statement_timeout is the legacy behaviour, it should be kept.
> It's behaviour is well understood: the statement finishes under
> the specified time or it throws an error. The problem from the
> application point of view is that the error can happen while
> the tuples are being transferred to the client, so the recordset
> can be cut in half.
>
> "lock_timeout_stmt" (or lock_timeout_option = per_statement)
> is somewhat extending the statement_timeout as only the
> time waiting on locks are counted, so SELECT FOR UPDATE/etc.
> may throw an error but if all rows are collected already, or
> plain SELECT is run, the application gets them all.
> This seems to follow the Microsoft SQL Server semantics:
> http://msdn.microsoft.com/en-us/library/ms189470.aspx

The documentation at that link appears to match what 'lock_timeout'
would do. Note that it says 'a lock' here: "When a wait for a lock
exceeds the time-out value, an error is returned.", or have you tested
the actual behavior and seen it treat this value as an accumulated
time across all lock waits for an entire statement?

> The per-lock lock_timeout was the result of a big Informix
> project ported to PostgreSQL, this follows Informix semantics:
> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm

I agree that the Informix command looks to be per-lock, but based on my
simple reading of both documentation links, I think they're actually the
same behavior and neither is about an accumulated time.

> Because Timestamp[Tz] is microsecond resolution, and the timeout
> GUCs are in milliseconds. Adding up time differences (and rounding
> or truncating them to millisecond in the process) would make the
> per-statement lock timeout lose precision...

Removing the accumulation-based option would fix that.. :D

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-24 14:08:34
Message-ID: 20130224140834.GH16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Heikki Linnakangas (hlinnakangas(at)vmware(dot)com) wrote:
> So if you want to be kind to readers, look at the patch and choose
> the format depending on which one makes it look better. But there's
> no need to make a point of it when someone posts in "wrong" format.

To be more precise- my main complaint about this is that this patch is
making changes to multi-line comments and to documentation, both of
which get very annoying to try and read in uniform diff format.
Patches that don't do one or the other of those are likely incomplete
anyway.

As another point, it's also the very first thing that we document in
http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.

> >Every unified diff can be turned into a context diff by passing it
> >though "filterdiff --format=context".
>
> Yep. And in emacs, there's "diff-unified->context" command. I bet
> most editors have a similar functionality these days.

And it probably doesn't work for every patch either, just as filterdiff
doesn't (see my other email).

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-24 15:14:41
Message-ID: 512A2E61.7080303@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-24 15:03 keltezéssel, Stephen Frost írta:
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> 2013-02-24 03:23 keltezéssel, Stephen Frost írta:
>>> No, it isn't. Patches posted to the list should be in context diff
>>> format (and uncompressed unless it's too large) for easier reading.
>>> That avoids having to download it, apply it to a git tree and then
>>> create the diff ourselves. Indeed, the move to git had impact on this
>>> at all.
>> I remembered this mail from The Master(tm):
>> http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us
> Which matches exactly what I was saying- context diff provides easier
> readind for review purposes, which is presumably what you were looking
> to have happen when you posted this patch to the mailing list. And it's
> still the documented expectation and project policy, regardless of any
> individual's email. If we're going to change it, then the
> documentation, et al, should be updated as well.
>
> For my 2c, I still prefer context diffs posted to the mailing lists,
> but would also like to see more people posting pull requests. That
> doesn't make it project policy though.
>
>> So, more than halfof the recently posted patches come
>> directly from "git diff". The preference has changed.
> No, more people have ignored the project policy than not- that doesn't
> say anything about the preference or what the policy is.
>
>>> lock_timeout_wait and lock_timeout_stmt_wait ?
>> Hm. How about without the "_wait" suffix?
>> Or lock_timeout vs statement_lock_timeout?
> I could live without the _wait suffix, but it occurs to me that none of
> these really indicate that statement_lock_timeout is an accumulated
> timeout. Perhaps it should say 'total lock wait timeout' or similar?
>
>> statement_timeout is the legacy behaviour, it should be kept.
>> It's behaviour is well understood: the statement finishes under
>> the specified time or it throws an error. The problem from the
>> application point of view is that the error can happen while
>> the tuples are being transferred to the client, so the recordset
>> can be cut in half.
>>
>> "lock_timeout_stmt" (or lock_timeout_option = per_statement)
>> is somewhat extending the statement_timeout as only the
>> time waiting on locks are counted, so SELECT FOR UPDATE/etc.
>> may throw an error but if all rows are collected already, or
>> plain SELECT is run, the application gets them all.
>> This seems to follow the Microsoft SQL Server semantics:
>> http://msdn.microsoft.com/en-us/library/ms189470.aspx
> The documentation at that link appears to match what 'lock_timeout'
> would do. Note that it says 'a lock' here: "When a wait for a lock
> exceeds the time-out value, an error is returned.", or have you tested
> the actual behavior and seen it treat this value as an accumulated
> time across all lock waits for an entire statement?

(Note to myself: don't read with headache.)
I may have misread the MSDN link.

>
>> The per-lock lock_timeout was the result of a big Informix
>> project ported to PostgreSQL, this follows Informix semantics:
>> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm
> I agree that the Informix command looks to be per-lock, but based on my
> simple reading of both documentation links, I think they're actually the
> same behavior and neither is about an accumulated time.

You seem to be right.

>> Because Timestamp[Tz] is microsecond resolution, and the timeout
>> GUCs are in milliseconds. Adding up time differences (and rounding
>> or truncating them to millisecond in the process) would make the
>> per-statement lock timeout lose precision...
> Removing the accumulation-based option would fix that.. :D

To call out the wrath of Tom? No, thanks. :-D
IIRC, he was the one who didn't like the per-lock behaviour
the first time he looked at this patch in an earlier form
back in 2010/2011 and he proposed this use case instead.
If not him, then someone else. I got the idea from this list...

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-24 17:39:38
Message-ID: CAGTBQpartOmBJeppiL--sDWUGOUsiiiN=POfw9sVw0kC5+GLkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 24, 2013 at 11:08 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Heikki Linnakangas (hlinnakangas(at)vmware(dot)com) wrote:
>> So if you want to be kind to readers, look at the patch and choose
>> the format depending on which one makes it look better. But there's
>> no need to make a point of it when someone posts in "wrong" format.
>
> To be more precise- my main complaint about this is that this patch is
> making changes to multi-line comments and to documentation, both of
> which get very annoying to try and read in uniform diff format.
> Patches that don't do one or the other of those are likely incomplete
> anyway.
>
> As another point, it's also the very first thing that we document in
> http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.

TBH, that wiki link seems to suggest that *having context* is the
point of the requirement (to be able to merge with fuzz).

Both unified and context formats have context.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-24 19:42:02
Message-ID: 512A6D0A.2070304@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2013 12:39 PM, Claudio Freire wrote:
> On Sun, Feb 24, 2013 at 11:08 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> * Heikki Linnakangas (hlinnakangas(at)vmware(dot)com) wrote:
>>> So if you want to be kind to readers, look at the patch and choose
>>> the format depending on which one makes it look better. But there's
>>> no need to make a point of it when someone posts in "wrong" format.
>> To be more precise- my main complaint about this is that this patch is
>> making changes to multi-line comments and to documentation, both of
>> which get very annoying to try and read in uniform diff format.
>> Patches that don't do one or the other of those are likely incomplete
>> anyway.
>>
>> As another point, it's also the very first thing that we document in
>> http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.
>
> TBH, that wiki link seems to suggest that *having context* is the
> point of the requirement (to be able to merge with fuzz).
>
> Both unified and context formats have context.
>

No, you're missing the point. Some people find reading context diffs
much easier than reading unified diffs. That's why context diffs are the
project's stated preference.

cheers

andrew


From: Peter Geoghegan <peter(dot)geoghegan86(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-24 20:08:33
Message-ID: CAEYLb_XAx_CPVexVciazZ=2k7WvUVnJuETioWZwaFXpuMK58YQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 February 2013 08:44, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
> I can't speak for others, but I personally don't care whether a patch is
> posted in unified or context diff format. Not as a general rule, anyway;
> patches that modify a few lines here and there are generally more readable
> in unified format, as the old and new lines are lined up right next to each
> other:

I don't care either. My personal preference is context diff format,
but then that's what I usually see anyway. I don't use filterdiff or
anything like that. I just have a strong habit of using feature
branches extensively, even for patches that I'm reviewing, and my
setup makes that easy to create from a patch file. It's quite a rare
occurrence for me to care enough about a patch to want to eyeball the
code (and not just read the author's summary), and yet not care about
it enough to make a feature branch for it. I can see how other
people's habits might differ from my own here, and that they might
reasonably state a preference for unified, which is fine. I developed
a preference for unified over time, having originally just used the
format on the advice of the wiki.

--
Regards,
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-24 21:31:44
Message-ID: 20130224213144.GL16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Claudio Freire (klaussfreire(at)gmail(dot)com) wrote:
> > As another point, it's also the very first thing that we document in
> > http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.
>
> TBH, that wiki link seems to suggest that *having context* is the
> point of the requirement (to be able to merge with fuzz).

The PG wiki link states "Is the patch in context diff format?" and
provides a link to the wikipedia article about *that specific format*.
There's absolutely zero confusion over what "context diff format" means.

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-25 08:23:34
Message-ID: 512B1F86.9020405@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-24 16:14 keltezéssel, Boszormenyi Zoltan írta:
> 2013-02-24 15:03 keltezéssel, Stephen Frost írta:
>> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>>> 2013-02-24 03:23 keltezéssel, Stephen Frost írta:
>>>> No, it isn't. Patches posted to the list should be in context diff
>>>> format (and uncompressed unless it's too large) for easier reading.
>>>> That avoids having to download it, apply it to a git tree and then
>>>> create the diff ourselves. Indeed, the move to git had impact on this
>>>> at all.
>>> I remembered this mail from The Master(tm):
>>> http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us
>> Which matches exactly what I was saying- context diff provides easier
>> readind for review purposes, which is presumably what you were looking
>> to have happen when you posted this patch to the mailing list. And it's
>> still the documented expectation and project policy, regardless of any
>> individual's email. If we're going to change it, then the
>> documentation, et al, should be updated as well.
>>
>> For my 2c, I still prefer context diffs posted to the mailing lists,
>> but would also like to see more people posting pull requests. That
>> doesn't make it project policy though.
>>
>>> So, more than halfof the recently posted patches come
>>> directly from "git diff". The preference has changed.
>> No, more people have ignored the project policy than not- that doesn't
>> say anything about the preference or what the policy is.
>>
>>>> lock_timeout_wait and lock_timeout_stmt_wait ?
>>> Hm. How about without the "_wait" suffix?
>>> Or lock_timeout vs statement_lock_timeout?
>> I could live without the _wait suffix, but it occurs to me that none of
>> these really indicate that statement_lock_timeout is an accumulated
>> timeout. Perhaps it should say 'total lock wait timeout' or similar?
>>
>>> statement_timeout is the legacy behaviour, it should be kept.
>>> It's behaviour is well understood: the statement finishes under
>>> the specified time or it throws an error. The problem from the
>>> application point of view is that the error can happen while
>>> the tuples are being transferred to the client, so the recordset
>>> can be cut in half.
>>>
>>> "lock_timeout_stmt" (or lock_timeout_option = per_statement)
>>> is somewhat extending the statement_timeout as only the
>>> time waiting on locks are counted, so SELECT FOR UPDATE/etc.
>>> may throw an error but if all rows are collected already, or
>>> plain SELECT is run, the application gets them all.
>>> This seems to follow the Microsoft SQL Server semantics:
>>> http://msdn.microsoft.com/en-us/library/ms189470.aspx
>> The documentation at that link appears to match what 'lock_timeout'
>> would do. Note that it says 'a lock' here: "When a wait for a lock
>> exceeds the time-out value, an error is returned.", or have you tested
>> the actual behavior and seen it treat this value as an accumulated
>> time across all lock waits for an entire statement?
>
> (Note to myself: don't read with headache.)
> I may have misread the MSDN link.
>
>>
>>> The per-lock lock_timeout was the result of a big Informix
>>> project ported to PostgreSQL, this follows Informix semantics:
>>> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm
>>>
>> I agree that the Informix command looks to be per-lock, but based on my
>> simple reading of both documentation links, I think they're actually the
>> same behavior and neither is about an accumulated time.
>
> You seem to be right.
>
>>> Because Timestamp[Tz] is microsecond resolution, and the timeout
>>> GUCs are in milliseconds. Adding up time differences (and rounding
>>> or truncating them to millisecond in the process) would make the
>>> per-statement lock timeout lose precision...
>> Removing the accumulation-based option would fix that.. :D
>
> To call out the wrath of Tom? No, thanks. :-D
> IIRC, he was the one who didn't like the per-lock behaviour
> the first time he looked at this patch in an earlier form
> back in 2010/2011 and he proposed this use case instead.
> If not him, then someone else. I got the idea from this list...

Another question just popped up. Now, that
bool enable_multiple_timeouts(List *timeouts);
exists, do we really need the singular versions?

Since the "timeout after N msec" have the per-lock and per-stmt
versions, enable_timeout_after() gained a new argument to
distinguish between the two cases and every occurrences of
this function happen to just use "0" here. The only usage of the
per-stmt variant is used with enable_multiple_timeouts().

Wouldn't it be better to have a single
bool enable_timeouts(List *timeouts);
instead?

>
> Best regards,
> Zoltán Böszörményi
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 12:56:40
Message-ID: CA+TgmoY6ZkE_fteHDQyzna5=yGYCYP3L7xEBByHeNPHswrMqKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 24, 2013 at 4:31 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Claudio Freire (klaussfreire(at)gmail(dot)com) wrote:
>> > As another point, it's also the very first thing that we document in
>> > http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.
>>
>> TBH, that wiki link seems to suggest that *having context* is the
>> point of the requirement (to be able to merge with fuzz).
>
> The PG wiki link states "Is the patch in context diff format?" and
> provides a link to the wikipedia article about *that specific format*.
> There's absolutely zero confusion over what "context diff format" means.

True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
Everyone here who reviews patches regularly knows how to, and probably
does, convert between those formats with regularity. Making patch
submitters feel badly because they've used the "wrong" format does not
advance the goals of the project.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-25 14:05:24
Message-ID: 20130225140524.GN16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zoltan,

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> Another question just popped up. Now, that
> bool enable_multiple_timeouts(List *timeouts);
> exists, do we really need the singular versions?
>
> Since the "timeout after N msec" have the per-lock and per-stmt
> versions, enable_timeout_after() gained a new argument to
> distinguish between the two cases and every occurrences of
> this function happen to just use "0" here. The only usage of the
> per-stmt variant is used with enable_multiple_timeouts().

For my 2c, I didn't partciularly care for changing
enable_timeout_after() by adding an extra parameter that ended up being
passed as ',0'.. Perhaps make it a wrapper instead of changing the
definition and leaving the invocations of it alone?

> Wouldn't it be better to have a single
> bool enable_timeouts(List *timeouts);
> instead?

This might also work though, if everything is updated to use it and it's
relatively clean. I realize for the aggregate case, you have to have
it, but I really don't like the changes to have to reset the counter
either.

Tom, can you comment on your thoughts around this notion of an aggregate
time constraint for waiting on locks? As mentioned upthread, I like the
idea of having an upper-limit on waiting for relation-level locks, but
once you're past that, I'm not sure that an aggregate waiting-on-locks
is any better than the overall statement-level timeout and it seems
somewhat messy, to me anyway.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 14:11:27
Message-ID: 20130225141127.GO16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.

Then let's change it, drop the preference, and update the documentation.
I'd certainly prefer that to getting shot for pointing out to patch
submitters that they're not following our documented guidelines.

> Everyone here who reviews patches regularly knows how to, and probably
> does, convert between those formats with regularity. Making patch
> submitters feel badly because they've used the "wrong" format does not
> advance the goals of the project.

For my part, I'd rather put the onus on the submitter to submit a
readable patch in the first part than ask the reviewer and anyone else
interested in looking at the patch to fix it. That's even more true
when you consider the archives and reading patches through the web
interface (though downloading the original mail message has gotten
better with the new archive code).

Thanks,

Stephen


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 14:17:25
Message-ID: 20130225141725.GA16569@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-02-25 09:11:27 -0500, Stephen Frost wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>
> Then let's change it, drop the preference, and update the documentation.

+1

> > Everyone here who reviews patches regularly knows how to, and probably
> > does, convert between those formats with regularity. Making patch
> > submitters feel badly because they've used the "wrong" format does not
> > advance the goals of the project.
>
> For my part, I'd rather put the onus on the submitter to submit a
> readable patch in the first part than ask the reviewer and anyone else
> interested in looking at the patch to fix it. That's even more true
> when you consider the archives and reading patches through the web
> interface (though downloading the original mail message has gotten
> better with the new archive code).

Well, the point is that you cannot satisfy enough people with either
choice anyway. I for one feel much more comfortable sending patches in a
format that I can actually read without thinking too much. Which is the
case for unified but definitely not for context. But its different for
others.

I gave in and made my mailreader convert all patches to unified for
reading, that way I don't care about other peoples preferences for one.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 14:25:50
Message-ID: 360.1361802350@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.

> Then let's change it, drop the preference, and update the documentation.

I think we should drop the hard requirement for context-format, and
instead say that it must not be plain (context-free) diff, since that
clearly *is* a hard requirement.

However, I liked the upthread suggestion (I think it was from Heikki)
that we recommend that submitters actually take a moment to think about
which format is more readable for their particular patch. Readability
is important not only to help people who just give the patch a quick
eyeball, but also to help the inevitable situations where hunks have
to be applied by hand because the underlying code has changed. The
less readable the patch, the more likely an error in doing that.
(And I trust we've all learned by now that git isn't so good at merging
that this isn't a problem.)

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 14:38:12
Message-ID: 20130225143812.GQ16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> >> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>
> > Then let's change it, drop the preference, and update the documentation.
>
> I think we should drop the hard requirement for context-format, and
> instead say that it must not be plain (context-free) diff, since that
> clearly *is* a hard requirement.

Alright, I'll start making those changes.

> However, I liked the upthread suggestion (I think it was from Heikki)
> that we recommend that submitters actually take a moment to think about
> which format is more readable for their particular patch. Readability
> is important not only to help people who just give the patch a quick
> eyeball, but also to help the inevitable situations where hunks have
> to be applied by hand because the underlying code has changed. The
> less readable the patch, the more likely an error in doing that.
> (And I trust we've all learned by now that git isn't so good at merging
> that this isn't a problem.)

I'll include that point in my changes but I consider the chances of that
actually happening with any regularity to be essentially zero. Reducing
our requirement to a level where the default passes means that nearly
everyone, who hasn't already changed things to use a non-default
automatically, is going to just use the default.

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 16:39:12
Message-ID: 512B93B0.7090000@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-25 15:25 keltezéssel, Tom Lane írta:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>>> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>> Then let's change it, drop the preference, and update the documentation.
> I think we should drop the hard requirement for context-format, and
> instead say that it must not be plain (context-free) diff, since that
> clearly *is* a hard requirement.
>
> However, I liked the upthread suggestion (I think it was from Heikki)
> that we recommend that submitters actually take a moment to think about
> which format is more readable for their particular patch. Readability
> is important not only to help people who just give the patch a quick
> eyeball, but also to help the inevitable situations where hunks have
> to be applied by hand because the underlying code has changed. The
> less readable the patch, the more likely an error in doing that.

+1

I think the readability is mostly the de-facto threshold. Considering
that quite a few plain "git diff" patches were committed during this CF
but those were readable in that format, even biggies like "teach
receivexlog to switch timelines" which (I just browsed it) had parts
that rewrote code in a way that the diff had pieces with different
indentation intermixed. This can constitute to being unreadable
at times but it also depends on the reader.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
Date: 2013-02-25 16:48:08
Message-ID: 20130225164808.GR16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Stephen Frost (sfrost(at)snowman(dot)net) wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > I think we should drop the hard requirement for context-format, and
> > instead say that it must not be plain (context-free) diff, since that
> > clearly *is* a hard requirement.
>
> Alright, I'll start making those changes.

I've updated the wiki pages accordingly, though there's now a number of
translated pages which also need updating to reflect this change.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 03:06:44
Message-ID: 20130227030644.GB16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zoltan,

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> attached is v30, I hope with everything fixed.

Making progress, certainly.

Given the hack to the API of enable_timeout_after() and the need for
timeout_reset_base_time(), I'm just going to voice my objection to the
"accumulated wait time on locks" portion again. I still like the idea
of a timeout for waiting on relation-level locks, as we acquire those
all up-front and we'd be able to just set a timeout at the appropriate
point and then release it when we're past acquiring the relation-level
locks. Seems like that would be much cleaner.

On the other hand, if we're going to go down this route, I'm really
starting to wonder if it should be the timeout systems responsibility to
keep track of such accumulated time.

Other than that..

> - List based enable/disable_multiple_timeouts()

That looks good, like the use of foreach(), etc, but I don't like how
you've set up delay_ms as a pointer..? Looks to be to allow you to
initialize the TimeoutParams structs early in proc.c..? Is there
another reason it needs to be a pointer that I'm missing? Why not build
the TimeoutParam strcutures in the if() blocks that check if the GUCs
are set..?

> - separate per-lock and per-statement lock_timeout variants
> - modified comments and documentation

Thanks.

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 09:40:16
Message-ID: 512DD480.2060400@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-27 04:06 keltezéssel, Stephen Frost írta:
> Zoltan,
>
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> attached is v30, I hope with everything fixed.
> Making progress, certainly.
>
> Given the hack to the API of enable_timeout_after() and the need for
> timeout_reset_base_time(), I'm just going to voice my objection to the
> "accumulated wait time on locks" portion again. I still like the idea
> of a timeout for waiting on relation-level locks, as we acquire those
> all up-front and we'd be able to just set a timeout at the appropriate
> point and then release it when we're past acquiring the relation-level
> locks. Seems like that would be much cleaner.

Yes, it would be cleaner. But this way the total timeout the
statement would wait for locks is N * lock_timeout - <some delta>
in the worst case when all the locks can be acquired just under
the set timeout interval, N being the number of locks that statement
wants to acquire. For some use cases, it's better to have known limit
in the total amount of wait time. But unlike statement_timeout,
with lock_timeout_stmt the statement can still finish after this limit
as it does useful work besides waiting for locks.

> On the other hand, if we're going to go down this route, I'm really
> starting to wonder if it should be the timeout systems responsibility to
> keep track of such accumulated time.

Thinking about it a bit more, I start to agree with it.
It's not likely that any new timeout sources will get added
to proc.c that has anything to do with waiting across multiple locks.
From this POV, this accumulated time can be done by proc.c itself.
But this makes it necessary to reschedule the timer from the
ProcSleep() loop so it increases the number of setitimer() calls.
But with clever coding, the "it_interval" part of struct itimerval
can be used to decrease the number of setitimer calls, so it may
be balanced out.

Another thought is that there is no need to have an extra function
to set the start time for the timeouts. It can be folded into
enable_timeout_after(), enable_timeout_at() and
enable_multiple_timeouts() and it simplifies the API, too.

Since setitimer() has microsecond resolution, I wonder whether
the timeout.c code shouldn't accept that, too. Especially if we want
to keep the per-statement accumulated version for the lock timeout.
Then time to wait that can be represented using int32 would be
about 35.8 minutes at most, we will need to use int64 if the maximum
number of millisecs is to stay as INT_MAX, which I guess it should.

Comments?

> Other than that..
>
>> - List based enable/disable_multiple_timeouts()
> That looks good, like the use of foreach(), etc, but I don't like how
> you've set up delay_ms as a pointer..? Looks to be to allow you to
> initialize the TimeoutParams structs early in proc.c..? Is there
> another reason it needs to be a pointer that I'm missing? Why not build
> the TimeoutParam strcutures in the if() blocks that check if the GUCs
> are set..?

It's fixed in my tree and I also added an Assert to
enable_multiple_timeouts() to make sure all timeout sources
use either TIMEOUT_AT or TIMEOUT_AFTER. Some more
comments were also fixed. But I would like to send out the
new patch after the above questions are settled.

I also want to thank you for your comments. It's good to have
a fresh look on this code.

>> - separate per-lock and per-statement lock_timeout variants
>> - modified comments and documentation
> Thanks.
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 13:58:15
Message-ID: 20130227135814.GF16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> But unlike statement_timeout,
> with lock_timeout_stmt the statement can still finish after this limit
> as it does useful work besides waiting for locks.

It's still entirely possible to get 99% done and then hit that last
tuple that you need a lock on and just tip over the lock_timeout_stmt
limit due to prior waiting and ending up wasting a bunch of work, hence
why I'm not entirely sure that this is that much better than
statement_timeout.

> Thinking about it a bit more, I start to agree with it.
> It's not likely that any new timeout sources will get added
> to proc.c that has anything to do with waiting across multiple locks.
> From this POV, this accumulated time can be done by proc.c itself.
> But this makes it necessary to reschedule the timer from the
> ProcSleep() loop so it increases the number of setitimer() calls.
> But with clever coding, the "it_interval" part of struct itimerval
> can be used to decrease the number of setitimer calls, so it may
> be balanced out.

We're not even going down this code path until we're already waiting on
a lock from someone, right? Not sure that we need to stress out too
much about calling setitimer().

> Another thought is that there is no need to have an extra function
> to set the start time for the timeouts. It can be folded into
> enable_timeout_after(), enable_timeout_at() and
> enable_multiple_timeouts() and it simplifies the API, too.

Right, back to how the API was originally, for the most part, no? :)

> Since setitimer() has microsecond resolution, I wonder whether
> the timeout.c code shouldn't accept that, too. Especially if we want
> to keep the per-statement accumulated version for the lock timeout.
> Then time to wait that can be represented using int32 would be
> about 35.8 minutes at most, we will need to use int64 if the maximum
> number of millisecs is to stay as INT_MAX, which I guess it should.
>
> Comments?

Wouldn't that impact statement_timeout also..? I can definitely see
someone wanting to set that at, say, an hour. If anything, I continue
to feel that going the *other* way makes more sense- keep everything at
millisecond and just floor() down to the ms when we're doing accounting
based on microsecond information. Sure, we might end up waiting a bit
(a very small bit..) longer than we were supposed to, but I hardly see
that as being a major complaint.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 18:07:02
Message-ID: 25441.1361988422@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Tom, can you comment on your thoughts around this notion of an aggregate
> time constraint for waiting on locks? As mentioned upthread, I like the
> idea of having an upper-limit on waiting for relation-level locks, but
> once you're past that, I'm not sure that an aggregate waiting-on-locks
> is any better than the overall statement-level timeout and it seems
> somewhat messy, to me anyway.

I think anything that makes this patch simpler is a good idea. I don't
like any of the accum_time stuff: it complicates the timeout API
unreasonably and slows down existing use cases.

Some other thoughts:

* timeout_reset_base_time() seems like an ugly and unnecessary API wart.
I don't like the timeout_start state variable at all; if you need
several timeouts to be scheduled relative to the exact same starting
point, can't you just do that in a single enable_multiple_timeouts()
call? And I think the optional TMPARAM_ACCUM action is completely
unacceptable, because it supposes that every user of a timeout, now and
in the future, is okay with having their accumulated time reset at the
same point. The whole point of having invented this timeout API is to
serve timeout uses we don't currently foresee, so actions that affect
every timeout seem pretty undesirable.

* I don't care for changing the API of enable_timeout_after when there
is in fact not a single caller using the flags argument (probably
because the only defined flag is too baroque to be useful). If there
were a use case for the "accum" action it'd be better to have a separate
API function for it, probably.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 18:28:46
Message-ID: 20130227182846.GI16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> I think anything that makes this patch simpler is a good idea. I don't
> like any of the accum_time stuff: it complicates the timeout API
> unreasonably and slows down existing use cases.

Right, I think we're on the same page there- I had just commented to
Zoltan that tracking the accumulated time shouldn't be the timeout
system's responsibility and that the timout API really shouldn't need
to be changed.

I'm not convinced that the lock-time-accumulation-timeout capability
is really all that valuable in the first place though, but perhaps I'm
in the minority on that.

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 18:38:44
Message-ID: 512E52B4.6000307@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-27 19:07 keltezéssel, Tom Lane írta:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
>> Tom, can you comment on your thoughts around this notion of an aggregate
>> time constraint for waiting on locks? As mentioned upthread, I like the
>> idea of having an upper-limit on waiting for relation-level locks, but
>> once you're past that, I'm not sure that an aggregate waiting-on-locks
>> is any better than the overall statement-level timeout and it seems
>> somewhat messy, to me anyway.
> I think anything that makes this patch simpler is a good idea. I don't
> like any of the accum_time stuff: it complicates the timeout API
> unreasonably and slows down existing use cases.

Perfect. :-)

> Some other thoughts:
>
> * timeout_reset_base_time() seems like an ugly and unnecessary API wart.
> I don't like the timeout_start state variable at all; if you need
> several timeouts to be scheduled relative to the exact same starting
> point, can't you just do that in a single enable_multiple_timeouts()
> call? And I think the optional TMPARAM_ACCUM action is completely
> unacceptable,

If we get rid of the per-statement variant, there is no need for that either.

> because it supposes that every user of a timeout, now and
> in the future, is okay with having their accumulated time reset at the
> same point. The whole point of having invented this timeout API is to
> serve timeout uses we don't currently foresee, so actions that affect
> every timeout seem pretty undesirable.
>
> * I don't care for changing the API of enable_timeout_after when there
> is in fact not a single caller using the flags argument (probably
> because the only defined flag is too baroque to be useful). If there
> were a use case for the "accum" action it'd be better to have a separate
> API function for it, probably.

This way, enable_timeout_after() wouldn't have this extra argument either.

Thanks for your kind words.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 19:06:34
Message-ID: 20130227190634.GK16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zoltan,

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> If we get rid of the per-statement variant, there is no need for that either.

For my 2c, I didn't see Tom's comments as saying that we shouldn't have
that capability, just that the implementation was ugly. :)

That said, perhaps we should just drop it for now, get the lock_timeout
piece solid, and then come back to the question about lock_timeout_stmt.

Thanks,

Stephen


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-27 19:38:29
Message-ID: 512E60B5.8020707@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-27 20:06 keltezéssel, Stephen Frost írta:
> Zoltan,
>
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> If we get rid of the per-statement variant, there is no need for that either.
> For my 2c, I didn't see Tom's comments as saying that we shouldn't have
> that capability, just that the implementation was ugly. :)

But I am happy to drop it. ;-)

> That said, perhaps we should just drop it for now, get the lock_timeout
> piece solid, and then come back to the question about lock_timeout_stmt.

OK, let's do it this way.

>
> Thanks,
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-02-28 08:22:18
Message-ID: 512F13BA.1010100@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-02-27 20:38 keltezéssel, Boszormenyi Zoltan írta:
> 2013-02-27 20:06 keltezéssel, Stephen Frost írta:
>> Zoltan,
>>
>> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>>> If we get rid of the per-statement variant, there is no need for that either.
>> For my 2c, I didn't see Tom's comments as saying that we shouldn't have
>> that capability, just that the implementation was ugly. :)
>
> But I am happy to drop it. ;-)
>
>> That said, perhaps we should just drop it for now, get the lock_timeout
>> piece solid, and then come back to the question about lock_timeout_stmt.
>
> OK, let's do it this way.

Dropped the per-statement lock timeout for now. The patch is
now obviously simpler and shorter. I renamed
enable/disable_multiple_timeouts() to simply enable/disable_timeouts()
since the List* argument implies more than one of them and
you need to type less.

The comments and the documentation needs another review,
to make sure I left no traces of the per-statements variant.
I can't see any but I stared at this patch for so long that I can't
be sure anymore.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
2-lock_timeout-v33.patch text/x-patch 35.6 KB

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-04 03:28:36
Message-ID: 513414E4.7070102@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/27/2013 09:58 PM, Stephen Frost wrote:
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> But unlike statement_timeout,
>> with lock_timeout_stmt the statement can still finish after this limit
>> as it does useful work besides waiting for locks.
>
> It's still entirely possible to get 99% done and then hit that last
> tuple that you need a lock on and just tip over the lock_timeout_stmt
> limit due to prior waiting and ending up wasting a bunch of work, hence
> why I'm not entirely sure that this is that much better than
> statement_timeout.

There are questions about whether this is a good idea, and there's still
discussion ongoing. It doesn't look like it's in a state where we can
confidently say "let's include this for 9.3" to me, but I'd like other
viewpoints.

Should we bump this to the next CF? It's clearly still a viable idea,
just possibly not ready yet.

- --
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRNBTkAAoJELBXNkqjr+S2qd0H+gMdDFmoWLJbqw1IvlopTTiz
LtYr/lkmiRVFFOPgcAMwrDrTzT1AkGIHbkYd0erXqRUNsSrFY9O3FabyQYfo9QG2
5HhvZkmNxf+WFyaqpg1gq/L1pm+2gr0o0N3GabmJTmg9JO7sf1BUBv/EdImaq1CT
lARJXXNC5vI/sVr2P/GpazCzl2120t+ZM9QGyqqqrz6e5t3BjpkGR4Y7MxyVkcfs
hDOpVIoXMDwOZVJTojLHLqeBdjOljRhCjgkqHKXii9ZUBCs5jFGBT/yOQCTwA2xo
YyHbJt+7VJm/lTvG379Q/vXMvIAZkbWtENOwKokwPThlq2HDAAEluZ8U7h5/8i4=
=49I0
-----END PGP SIGNATURE-----


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-04 03:34:49
Message-ID: 20130304033449.GK16142@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig,

* Craig Ringer (craig(at)2ndquadrant(dot)com) wrote:
> There are questions about whether this is a good idea, and there's still
> discussion ongoing. It doesn't look like it's in a state where we can
> confidently say "let's include this for 9.3" to me, but I'd like other
> viewpoints.

For my part, I think the straight-up 'lock_timeout' piece, which is in
the latest patch, is in pretty good shape. It's much less invasive and
provides a capability which other RDBMS's have and is reasonably
straight forward.

> Should we bump this to the next CF? It's clearly still a viable idea,
> just possibly not ready yet.

Unless a committer steps up to take on the statement-level lock-wait
timeout, it's not going to get into 9.3, imv. Waiting to add that
(whatever it ends up being) until post-9.3 makes sense to me, but I'd
hate to miss getting the simpler lock_timeout capability into 9.3.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-06 18:23:24
Message-ID: CA+TgmobQ_ouxVaL1+VVpNsSuE1Ucq+J2UrHrNN_Q8=pHxAtaKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 27, 2013 at 8:58 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> But unlike statement_timeout,
>> with lock_timeout_stmt the statement can still finish after this limit
>> as it does useful work besides waiting for locks.
>
> It's still entirely possible to get 99% done and then hit that last
> tuple that you need a lock on and just tip over the lock_timeout_stmt
> limit due to prior waiting and ending up wasting a bunch of work, hence
> why I'm not entirely sure that this is that much better than
> statement_timeout.

I tend to agree that this should be based on the length of any
individual lock wait, not the cumulative duration of lock waits.
Otherwise, it seems like it'll be very hard to set this to a
meaningful value. For example, if you set this to 1 minute, and that
means the length of any single wait, then you basically know that
it'll only kick in if there is some other, long-running transaction
that's holding the lock. But if it means the cumulative length of all
waits, it's not so clear, because now you might also have this kick in
if you wait for 100ms on 600 different occasions. In other words,
complex queries that lock or update many tuples may get killed even if
they never wait very long at all for any single lock. That seems like
it will be almost indistinguishable from random, unprincipled query
cancellations.

Now, you could try to work around that by varying the setting based on
the complexity of the query you're about to run, but that seems like a
pain in the neck, to put it mildly. And it might still not give you
the behavior that you want. Personally, I'd think a big part of the
appeal of this is to make sure that you don't hang waiting for a
RELATION level lock for too long before giving up. And for that,
scaling with the complexity of the query would be exactly the wrong
thing to do, even if you could figure out some system for it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-06 18:53:49
Message-ID: 19702.1362596029@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Feb 27, 2013 at 8:58 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> It's still entirely possible to get 99% done and then hit that last
>> tuple that you need a lock on and just tip over the lock_timeout_stmt
>> limit due to prior waiting and ending up wasting a bunch of work, hence
>> why I'm not entirely sure that this is that much better than
>> statement_timeout.

> I tend to agree that this should be based on the length of any
> individual lock wait, not the cumulative duration of lock waits.
> Otherwise, it seems like it'll be very hard to set this to a
> meaningful value. For example, if you set this to 1 minute, and that
> means the length of any single wait, then you basically know that
> it'll only kick in if there is some other, long-running transaction
> that's holding the lock. But if it means the cumulative length of all
> waits, it's not so clear, because now you might also have this kick in
> if you wait for 100ms on 600 different occasions. In other words,
> complex queries that lock or update many tuples may get killed even if
> they never wait very long at all for any single lock. That seems like
> it will be almost indistinguishable from random, unprincipled query
> cancellations.

Yeah. I'm also unconvinced that there's really much use-case territory
here that statement_timeout doesn't cover well enough. To have a case
that statement-level lock timeout covers and statement_timeout doesn't,
you need to suppose that you know how long the query can realistically
wait for all locks together, but *not* how long it's going to run in the
absence of lock delays. That seems a bit far-fetched, particularly when
thinking of row-level locks, whose cumulative timeout would presumably
need to scale with the number of rows the query will visit.

If statement-level lock timeouts were cheap to add, that would be one
thing; but given that they're complicating the code materially, I think
we need a more convincing argument for them.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-06 18:58:06
Message-ID: 513791BE.1050601@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-06 19:53 keltezéssel, Tom Lane írta:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Feb 27, 2013 at 8:58 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>> It's still entirely possible to get 99% done and then hit that last
>>> tuple that you need a lock on and just tip over the lock_timeout_stmt
>>> limit due to prior waiting and ending up wasting a bunch of work, hence
>>> why I'm not entirely sure that this is that much better than
>>> statement_timeout.
>> I tend to agree that this should be based on the length of any
>> individual lock wait, not the cumulative duration of lock waits.
>> Otherwise, it seems like it'll be very hard to set this to a
>> meaningful value. For example, if you set this to 1 minute, and that
>> means the length of any single wait, then you basically know that
>> it'll only kick in if there is some other, long-running transaction
>> that's holding the lock. But if it means the cumulative length of all
>> waits, it's not so clear, because now you might also have this kick in
>> if you wait for 100ms on 600 different occasions. In other words,
>> complex queries that lock or update many tuples may get killed even if
>> they never wait very long at all for any single lock. That seems like
>> it will be almost indistinguishable from random, unprincipled query
>> cancellations.
> Yeah. I'm also unconvinced that there's really much use-case territory
> here that statement_timeout doesn't cover well enough. To have a case
> that statement-level lock timeout covers and statement_timeout doesn't,
> you need to suppose that you know how long the query can realistically
> wait for all locks together, but *not* how long it's going to run in the
> absence of lock delays. That seems a bit far-fetched, particularly when
> thinking of row-level locks, whose cumulative timeout would presumably
> need to scale with the number of rows the query will visit.
>
> If statement-level lock timeouts were cheap to add, that would be one
> thing; but given that they're complicating the code materially, I think
> we need a more convincing argument for them.

OK, so it's not wanted. Surprise, surprise, it was already dropped
from the patch. Can you _please_ review the last patch and comment
on it instead of the state of past?

Thanks and best regards,
Zoltán Böszörményi

>
> regards, tom lane
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-15 17:53:40
Message-ID: 23716.1363370020@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> [ 2-lock_timeout-v33.patch ]

I looked at this patch a bit. I don't understand why you've chosen to
alter the API of the enable_timeout variants to have a bool result that
says "I didn't bother to process the timeout because it would have fired
immediately". That is not an advantage for any call site that I can
see: it just means that the caller needs an extra, very difficult to
test code path to handle what would normally be handled by a timeout
interrupt. Even if it were a good API choice, you've broken a number of
existing call sites that the patch fails to touch (eg in postmaster.c
and postgres.c). It's not acceptable to define a failure return from
enable_timeout_after and then let callers assume that the failure can't
happen.

Please undo that.

Also, I'm not really enamored of the choice to use List* infrastructure
for enable_timeouts(). That doesn't appear to be especially convenient
for any caller, and what it does do is create memory-leak risks for most
of them (if they forget to free the list cells, perhaps as a result of
an error exit). I think a local-variable array of TimeoutParams[]
structs would serve better for most use-cases.

Another thought is that the PGSemaphoreTimedLock implementations all
share the same bug, which is that if the "condition" callback returns
true immediately after we acquire the semaphore, they will all wrongly
return false indicating that the semaphore wasn't acquired. (BTW,
I do not like either the variable name "condition" or the typedef name
"TimeoutCondition" for something that's actually a callback function
pointer.)

In the same vein, this comment change:

* NOTE: Think not to put any shared-state cleanup after the call to
* ProcSleep, in either the normal or failure path. The lock state must
! * be fully set by the lock grantor, or by CheckDeadLock if we give up
! * waiting for the lock. This is necessary because of the possibility
! * that a cancel/die interrupt will interrupt ProcSleep after someone else
! * grants us the lock, but before we've noticed it. Hence, after granting,
! * the locktable state must fully reflect the fact that we own the lock;
! * we can't do additional work on return.
*
* We can and do use a PG_TRY block to try to clean up after failure, but
* this still has a major limitation: elog(FATAL) can occur while waiting
--- 1594,1606 ----
/*
* NOTE: Think not to put any shared-state cleanup after the call to
* ProcSleep, in either the normal or failure path. The lock state must
! * be fully set by the lock grantor, or by CheckDeadLock or by ProcSleep
! * itself in case a timeout is detected or if we give up waiting for the lock.
! * This is necessary because of the possibility that a cancel/die interrupt
! * will interrupt ProcSleep after someone else grants us the lock, but
! * before we've noticed it. Hence, after granting, the locktable state must
! * fully reflect the fact that we own the lock; we can't do additional work
! * on return.
*

suggests that you've failed to grasp the fundamental race-condition
problem here. ProcSleep can't do cleanup after the fact any more than
its callers can, because otherwise it has a race condition with some
other process deciding to grant it the lock at about the same time its
timeout goes off. I think the changes in ProcSleep that alter the
state-cleanup behavior are just plain wrong, and what you need to do
is make that look more like the existing mechanisms that clean up when
statement timeout occurs.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-16 15:59:27
Message-ID: 514496DF.6070706@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-15 18:53 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan<zb(at)cybertec(dot)at> writes:
>> [ 2-lock_timeout-v33.patch ]
> I looked at this patch a bit. I don't understand why you've chosen to
> alter the API of the enable_timeout variants to have a bool result that
> says "I didn't bother to process the timeout because it would have fired
> immediately".

I don't know either...

> That is not an advantage for any call site that I can
> see: it just means that the caller needs an extra, very difficult to
> test code path to handle what would normally be handled by a timeout
> interrupt. Even if it were a good API choice, you've broken a number of
> existing call sites that the patch fails to touch (eg in postmaster.c
> and postgres.c). It's not acceptable to define a failure return from
> enable_timeout_after and then let callers assume that the failure can't
> happen.
>
> Please undo that.

Done.

> Also, I'm not really enamored of the choice to use List* infrastructure
> for enable_timeouts(). That doesn't appear to be especially convenient
> for any caller, and what it does do is create memory-leak risks for most
> of them (if they forget to free the list cells, perhaps as a result of
> an error exit). I think a local-variable array of TimeoutParams[]
> structs would serve better for most use-cases.

Changed. However, the first member of the structure is
"TimeoutId id" and a sensible end-of-array value can be -1.
Some versions of GCC (maybe other compilers, too) complain
if a constant is assigned to an enum which is outside of the
declared values of the enum. So I added a "NO_TIMEOUT = -1"
to enum TimeoutId. Comments?

> Another thought is that the PGSemaphoreTimedLock implementations all
> share the same bug, which is that if the "condition" callback returns
> true immediately after we acquire the semaphore, they will all wrongly
> return false indicating that the semaphore wasn't acquired.

You are right. Fixed.

> (BTW,
> I do not like either the variable name "condition" or the typedef name
> "TimeoutCondition" for something that's actually a callback function
> pointer.)

How about "TimeoutCallback" and "callback_fn"?

> In the same vein, this comment change:
>
> * NOTE: Think not to put any shared-state cleanup after the call to
> * ProcSleep, in either the normal or failure path. The lock state must
> ! * be fully set by the lock grantor, or by CheckDeadLock if we give up
> ! * waiting for the lock. This is necessary because of the possibility
> ! * that a cancel/die interrupt will interrupt ProcSleep after someone else
> ! * grants us the lock, but before we've noticed it. Hence, after granting,
> ! * the locktable state must fully reflect the fact that we own the lock;
> ! * we can't do additional work on return.
> *
> * We can and do use a PG_TRY block to try to clean up after failure, but
> * this still has a major limitation: elog(FATAL) can occur while waiting
> --- 1594,1606 ----
> /*
> * NOTE: Think not to put any shared-state cleanup after the call to
> * ProcSleep, in either the normal or failure path. The lock state must
> ! * be fully set by the lock grantor, or by CheckDeadLock or by ProcSleep
> ! * itself in case a timeout is detected or if we give up waiting for the lock.
> ! * This is necessary because of the possibility that a cancel/die interrupt
> ! * will interrupt ProcSleep after someone else grants us the lock, but
> ! * before we've noticed it. Hence, after granting, the locktable state must
> ! * fully reflect the fact that we own the lock; we can't do additional work
> ! * on return.
> *
>
> suggests that you've failed to grasp the fundamental race-condition
> problem here. ProcSleep can't do cleanup after the fact any more than
> its callers can, because otherwise it has a race condition with some
> other process deciding to grant it the lock at about the same time its
> timeout goes off. I think the changes in ProcSleep that alter the
> state-cleanup behavior are just plain wrong, and what you need to do
> is make that look more like the existing mechanisms that clean up when
> statement timeout occurs.

This was the most enlightening comment up to now.
It seems no one else understood the timer code but you...
Thanks very much.

I hope the way I did it is right. I factored out the core of
StatementCancelHandler() into a common function that can
be used by the lock_timeout interrupt as its timeout callback
function. Now the code doesn't need PGSemaphoreTimedLock().

While I was thinking about how this thing works, I recognized
that I also need to set the timeout indicator to false after checking
it in ProcessInterrupts().

The reason is that it's needed is this scenario:
1. lock_timeout is set and the transaction throws its error
2. lock_timeout is unset before the next transaction
3. the user presses Ctrl-C during the next transaction
In this case, the second transaction would report the
lock timeout error, since the indicator was still set.

The other way would be to call disable_all_timeouts(false)
unconditionally in ProcessInterrupts() but setting only the indicator
is faster and it doesn't interfere with unknown registered timeout.
Also, the reason is that it's disable_timeout_indicator() instead of
a generic set_timeout_indicator() is that the generic one may be
abused to produce false positives.

Best regards,
Zoltán Böszörményi

> regards, tom lane
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web:http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
2-lock_timeout-v36.patch text/x-patch 25.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-16 16:42:51
Message-ID: 22733.1363452171@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> 2013-03-15 18:53 keltezssel, Tom Lane rta:
>> Also, I'm not really enamored of the choice to use List* infrastructure
>> for enable_timeouts().

> Changed. However, the first member of the structure is
> "TimeoutId id" and a sensible end-of-array value can be -1.
> Some versions of GCC (maybe other compilers, too) complain
> if a constant is assigned to an enum which is outside of the
> declared values of the enum. So I added a "NO_TIMEOUT = -1"
> to enum TimeoutId. Comments?

I was thinking more of having array pointer and count parameters, ie
enable_timeouts(TimeoutParams *timeouts, int n)
I guess we could do it with sentinels instead but not sure that's
better.

The sentinel approach might be all right if there was another reason
to have an "invalid" value in the enum, but I'm not seeing one ATM.

> I hope the way I did it is right. I factored out the core of
> StatementCancelHandler() into a common function that can
> be used by the lock_timeout interrupt as its timeout callback
> function. Now the code doesn't need PGSemaphoreTimedLock().

Hm, not needing PGSemaphoreTimedLock at all is an improvement for
sure. Don't have time right now to look closer though.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-16 17:17:00
Message-ID: 5144A90C.2050908@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-16 17:42 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> 2013-03-15 18:53 keltezéssel, Tom Lane írta:
>>> Also, I'm not really enamored of the choice to use List* infrastructure
>>> for enable_timeouts().
>> Changed. However, the first member of the structure is
>> "TimeoutId id" and a sensible end-of-array value can be -1.
>> Some versions of GCC (maybe other compilers, too) complain
>> if a constant is assigned to an enum which is outside of the
>> declared values of the enum. So I added a "NO_TIMEOUT = -1"
>> to enum TimeoutId. Comments?
> I was thinking more of having array pointer and count parameters, ie
> enable_timeouts(TimeoutParams *timeouts, int n)
> I guess we could do it with sentinels instead but not sure that's
> better.
>
> The sentinel approach might be all right if there was another reason
> to have an "invalid" value in the enum, but I'm not seeing one ATM.

Stephen Frost was against the array pointer/count variant,
it was done that way earlier. Let me redo it again. :-)

>
>> I hope the way I did it is right. I factored out the core of
>> StatementCancelHandler() into a common function that can
>> be used by the lock_timeout interrupt as its timeout callback
>> function. Now the code doesn't need PGSemaphoreTimedLock().
> Hm, not needing PGSemaphoreTimedLock at all is an improvement for
> sure. Don't have time right now to look closer though.
>
> regards, tom lane
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
2-lock_timeout-v37.patch text/x-patch 24.9 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-16 17:48:35
Message-ID: 20130316174835.GQ4361@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
> Stephen Frost was against the array pointer/count variant,
> it was done that way earlier. Let me redo it again. :-)

I still don't particularly like the array approach, and see the
array+count approach as worse (seems like a higher chance that the count
will end up being wrong at some point than having an array termination
identifier). I still like the List approach, as that builds on a
structure we've already got and can take advantage of the existing
infrastructure. but Tom's got a good point regarding the potential for
memory leaks with that solution.

I havn't had a chance to look, but I would have expected the Lists for
these to be allocated in a per-statement context, which would address
the memory leak issue. Perhaps that isn't possible though. I agree
that the List construct doesn't particularly help the callers, though I
do think it makes the enable_timeouts() function cleaner.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-16 18:04:25
Message-ID: 24166.1363457065@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Boszormenyi Zoltan (zb(at)cybertec(dot)at) wrote:
>> Stephen Frost was against the array pointer/count variant,
>> it was done that way earlier. Let me redo it again. :-)

> I still don't particularly like the array approach, and see the
> array+count approach as worse (seems like a higher chance that the count
> will end up being wrong at some point than having an array termination
> identifier). I still like the List approach, as that builds on a
> structure we've already got and can take advantage of the existing
> infrastructure. but Tom's got a good point regarding the potential for
> memory leaks with that solution.

> I havn't had a chance to look, but I would have expected the Lists for
> these to be allocated in a per-statement context, which would address
> the memory leak issue. Perhaps that isn't possible though. I agree
> that the List construct doesn't particularly help the callers, though I
> do think it makes the enable_timeouts() function cleaner.

I'm not sure about that. I think locks will typically get taken in
query-lifespan or transaction-lifespan contexts, so that repeated leaks
would be significant. Possibly my fear of leaks during error exits is
unfounded, but I think forgetting to pfree the list cells in normal
execution would be a problem.

Anyway, the bottom line for me was that the List way didn't seem to be
either convenient for the callers or especially efficient, because of
the need to palloc list cells and then remember to pfree them again.

Another way that we perhaps should consider is to follow the example of
XLogInsert and use internally-threaded lists that are typically stored
in local arrays in the callers. I've never thought that way was
especially beautiful, but it does have the advantage of being an idiom
that's already in use in other low-level code.

On the whole though, I don't see anything wrong with pointer-and-count.
I don't really believe that there's ever going to be a need to enable
more than a couple of timeouts simultaneously, so I don't want an overly
complicated data structure for it.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-16 18:26:34
Message-ID: 20130316182634.GR4361@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> On the whole though, I don't see anything wrong with pointer-and-count.
> I don't really believe that there's ever going to be a need to enable
> more than a couple of timeouts simultaneously, so I don't want an overly
> complicated data structure for it.

Alright, fair enough.

Zoltan, sorry for the back-and-forth Zoltan and thanks for being
persistent; I'd really like to see this capability added.

Thanks again,

Stephen


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 00:35:40
Message-ID: 20130317003540.GA3686@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:

> Another way that we perhaps should consider is to follow the example of
> XLogInsert and use internally-threaded lists that are typically stored
> in local arrays in the callers. I've never thought that way was
> especially beautiful, but it does have the advantage of being an idiom
> that's already in use in other low-level code.

FWIW you could use an slist from ilist.c. It means each node would need
a "next" pointer, but there's no separately allocated list cell.

There are many places that could use slist/dlist. For instance while
reading the SET PERSISTENT patch I noticed it has head and tail pointers
being passed all over the place, which looks rather ugly.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 03:48:16
Message-ID: 1849.1363492096@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> [ 2-lock_timeout-v37.patch ]

Applied after a fair amount of additional hacking.

I was disappointed to find that the patch introduced a new race
condition into timeout.c, or at least broke a safety factor that had
been there. The argument why enable_timeout() could skip disabling
the timer interrupt on entry, if num_active_timeouts is zero, doesn't
work for enable_timeouts(): as soon as you've incremented
num_active_timeouts for the first new timeout, the signal handler would
mess around with the data structure if it were to fire. What I did
about that was to modify disable_alarm() to forcibly disable the
interrupt if we're adding multiple timeouts in enable_timeouts(), even
if we think no interrupt is pending. This might be overly paranoid,
but since all of this is new code for 9.3 and hasn't been through any
beta testing, I felt it best to preserve that safety feature. We can
revisit it later if it proves to be an issue. (It's conceivable for
instance that we could avoid incrementing the stored value of
num_active_timeouts until we're done adding all the new timeouts;
but that seemed pretty messy.) For the current usage pattern I'm not
too sure that it matters anyway: a savings is only possible if you
have enabled lock_timeout but not statement_timeout, and I'm doubtful
that that will be a common use-case.

I also rearranged the handling of the LOCK_TIMEOUT interrupt some more,
since I didn't see a need for it to be different from STATEMENT_TIMEOUT,
and got rid of some non-C89 coding practices that didn't seem to me to
be improving readability anyway.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndquadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 03:53:50
Message-ID: 2004.1363492430@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> Another way that we perhaps should consider is to follow the example of
>> XLogInsert and use internally-threaded lists that are typically stored
>> in local arrays in the callers. I've never thought that way was
>> especially beautiful, but it does have the advantage of being an idiom
>> that's already in use in other low-level code.

> FWIW you could use an slist from ilist.c. It means each node would need
> a "next" pointer, but there's no separately allocated list cell.

Yeah, if the usage patterns were more complicated it'd be worth thinking
about that. Right now there's nothing more complex than this:

*************** ResolveRecoveryConflictWithBufferPin(voi
*** 428,435 ****
* Wake up at ltime, and check for deadlocks as well if we will be
* waiting longer than deadlock_timeout
*/
! enable_timeout_after(STANDBY_DEADLOCK_TIMEOUT, DeadlockTimeout);
! enable_timeout_at(STANDBY_TIMEOUT, ltime);
}

/* Wait to be signaled by UnpinBuffer() */
--- 428,442 ----
* Wake up at ltime, and check for deadlocks as well if we will be
* waiting longer than deadlock_timeout
*/
! EnableTimeoutParams timeouts[2];
!
! timeouts[0].id = STANDBY_TIMEOUT;
! timeouts[0].type = TMPARAM_AT;
! timeouts[0].fin_time = ltime;
! timeouts[1].id = STANDBY_DEADLOCK_TIMEOUT;
! timeouts[1].type = TMPARAM_AFTER;
! timeouts[1].delay_ms = DeadlockTimeout;
! enable_timeouts(timeouts, 2);
}

and you really can't improve that by complicating the data structure.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 08:00:50
Message-ID: 51457832.9050107@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-17 04:48 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> [ 2-lock_timeout-v37.patch ]
> Applied after a fair amount of additional hacking.

Thank you, thank you, thank you! :-)

> I was disappointed to find that the patch introduced a new race
> condition into timeout.c, or at least broke a safety factor that had
> been there. The argument why enable_timeout() could skip disabling
> the timer interrupt on entry, if num_active_timeouts is zero, doesn't
> work for enable_timeouts(): as soon as you've incremented
> num_active_timeouts for the first new timeout, the signal handler would
> mess around with the data structure if it were to fire. What I did
> about that was to modify disable_alarm() to forcibly disable the
> interrupt if we're adding multiple timeouts in enable_timeouts(), even
> if we think no interrupt is pending. This might be overly paranoid,
> but since all of this is new code for 9.3 and hasn't been through any
> beta testing, I felt it best to preserve that safety feature. We can
> revisit it later if it proves to be an issue. (It's conceivable for
> instance that we could avoid incrementing the stored value of
> num_active_timeouts until we're done adding all the new timeouts;
> but that seemed pretty messy.)

Your reasoning seems to be correct. However, if we take it to the
extreme, enable_timeout_at/enable_timeout_after/enable_timeouts
should also disable the interrupt handler at the beginning of the
function and enabled at the end with pqsignal(). An evil soul may
send SIGALRM externally to the backend processes at the proper
moment and create a race condition while enable_timeout*() is in
progress and the itimer is about to trigger at the same time (but
doesn't since it was disabled).

> For the current usage pattern I'm not
> too sure that it matters anyway: a savings is only possible if you
> have enabled lock_timeout but not statement_timeout, and I'm doubtful
> that that will be a common use-case.

I am not too sure about it. With lock_timeout in place, code migrated
from Informix, MSSQL, etc. can have the same semantic behaviour.

>
> I also rearranged the handling of the LOCK_TIMEOUT interrupt some more,
> since I didn't see a need for it to be different from STATEMENT_TIMEOUT,
> and got rid of some non-C89 coding practices that didn't seem to me to
> be improving readability anyway.

Thanks for that, too. I was too blind to see that choice, even after
thinking about why the statement_timeout cannot be done from
SIGALRM context and why does the code need to also send SIGINT
to the process group. (To kill children, too, like scripts executed via
system()...)

Thanks again,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 15:07:12
Message-ID: 18700.1363532832@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> 2013-03-17 04:48 keltezssel, Tom Lane rta:
>> [ worries about stray SIGALRM events ]

> Your reasoning seems to be correct. However, if we take it to the
> extreme, enable_timeout_at/enable_timeout_after/enable_timeouts
> should also disable the interrupt handler at the beginning of the
> function and enabled at the end with pqsignal(). An evil soul may
> send SIGALRM externally to the backend processes at the proper
> moment and create a race condition while enable_timeout*() is in
> progress and the itimer is about to trigger at the same time (but
> doesn't since it was disabled).

Well, a malefactor with the ability to send signals to a backend
process could also send SIGKILL, or any number of other signals that
would mess things up. I'm not too concerned about that scenario.
I *am* concerned about leftover timer events, both because this code
isn't very well tested, and because third-party code loaded into the
backend (think pl/perl for instance) could easily set up timer events
we weren't expecting.

It suddenly occurs to me though that there's more than one way to skin
this cat. We could easily add another static flag variable called
"sigalrm_allowed" or some such, and have the signal handler test that
and immediately return without doing anything if it's off. Clearing
and setting such a variable would be a lot cheaper than an extra
setitimer call, as well as more robust since it would protect us from
all sources of SIGALRM not just ITIMER_REAL.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-17 19:06:13
Message-ID: 51461425.5090501@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-17 16:07 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> 2013-03-17 04:48 keltezéssel, Tom Lane írta:
>>> [ worries about stray SIGALRM events ]
>> Your reasoning seems to be correct. However, if we take it to the
>> extreme, enable_timeout_at/enable_timeout_after/enable_timeouts
>> should also disable the interrupt handler at the beginning of the
>> function and enabled at the end with pqsignal(). An evil soul may
>> send SIGALRM externally to the backend processes at the proper
>> moment and create a race condition while enable_timeout*() is in
>> progress and the itimer is about to trigger at the same time (but
>> doesn't since it was disabled).
> Well, a malefactor with the ability to send signals to a backend
> process could also send SIGKILL, or any number of other signals that
> would mess things up. I'm not too concerned about that scenario.
> I *am* concerned about leftover timer events, both because this code
> isn't very well tested, and because third-party code loaded into the
> backend (think pl/perl for instance) could easily set up timer events
> we weren't expecting.

I see.

> It suddenly occurs to me though that there's more than one way to skin
> this cat. We could easily add another static flag variable called
> "sigalrm_allowed" or some such, and have the signal handler test that
> and immediately return without doing anything if it's off. Clearing
> and setting such a variable would be a lot cheaper than an extra
> setitimer call, as well as more robust since it would protect us from
> all sources of SIGALRM not just ITIMER_REAL.

Something like the attached patch?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
save-one-setitimer.patch text/x-patch 4.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-18 02:52:11
Message-ID: 21445.1363575131@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> 2013-03-17 16:07 keltezssel, Tom Lane rta:
>> It suddenly occurs to me though that there's more than one way to skin
>> this cat. We could easily add another static flag variable called
>> "sigalrm_allowed" or some such, and have the signal handler test that
>> and immediately return without doing anything if it's off. Clearing
>> and setting such a variable would be a lot cheaper than an extra
>> setitimer call, as well as more robust since it would protect us from
>> all sources of SIGALRM not just ITIMER_REAL.

> Something like the attached patch?

Well, something like that, but it still needed some improvements. In
particular I thought it best to leave the signal handler still releasing
the procLatch unconditionally --- that behavior is independent of what
this module is doing. Also you seem to have some odd ideas about what
do-while will accomplish. AFAIK, in this usage it's purely a syntactic
trick without much semantic content. It's the marking of the variable
as "volatile" that counts for telling the compiler not to re-order
operations.

Updated and committed.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-18 05:22:42
Message-ID: 5146A4A2.9090801@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-18 03:52 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> 2013-03-17 16:07 keltezéssel, Tom Lane írta:
>>> It suddenly occurs to me though that there's more than one way to skin
>>> this cat. We could easily add another static flag variable called
>>> "sigalrm_allowed" or some such, and have the signal handler test that
>>> and immediately return without doing anything if it's off. Clearing
>>> and setting such a variable would be a lot cheaper than an extra
>>> setitimer call, as well as more robust since it would protect us from
>>> all sources of SIGALRM not just ITIMER_REAL.
>> Something like the attached patch?
> Well, something like that, but it still needed some improvements. In
> particular I thought it best to leave the signal handler still releasing
> the procLatch unconditionally --- that behavior is independent of what
> this module is doing. Also you seem to have some odd ideas about what
> do-while will accomplish. AFAIK, in this usage it's purely a syntactic
> trick without much semantic content. It's the marking of the variable
> as "volatile" that counts for telling the compiler not to re-order
> operations.

The volatile marking shouldn't even be necessary there.
The signal handler doesn't writes to it, only the main code.
I just put it there saying "why not?" to myself.
IIRC, volatile is needed if both the signal handler and the
main code changes the same variable.

I got the reordering idea from here:
http://yarchive.net/comp/linux/compiler_barriers.html

Thanks for committing,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-18 05:29:37
Message-ID: 5146A641.1050505@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-18 06:22 keltezéssel, Boszormenyi Zoltan írta:
> 2013-03-18 03:52 keltezéssel, Tom Lane írta:
>> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>>> 2013-03-17 16:07 keltezéssel, Tom Lane írta:
>>>> It suddenly occurs to me though that there's more than one way to skin
>>>> this cat. We could easily add another static flag variable called
>>>> "sigalrm_allowed" or some such, and have the signal handler test that
>>>> and immediately return without doing anything if it's off. Clearing
>>>> and setting such a variable would be a lot cheaper than an extra
>>>> setitimer call, as well as more robust since it would protect us from
>>>> all sources of SIGALRM not just ITIMER_REAL.
>>> Something like the attached patch?
>> Well, something like that, but it still needed some improvements. In
>> particular I thought it best to leave the signal handler still releasing
>> the procLatch unconditionally --- that behavior is independent of what
>> this module is doing. Also you seem to have some odd ideas about what
>> do-while will accomplish. AFAIK, in this usage it's purely a syntactic
>> trick without much semantic content. It's the marking of the variable
>> as "volatile" that counts for telling the compiler not to re-order
>> operations.
>
> The volatile marking shouldn't even be necessary there.
> The signal handler doesn't writes to it, only the main code.
> I just put it there saying "why not?" to myself.
> IIRC, volatile is needed if both the signal handler and the
> main code changes the same variable.

Also, since the the variable assignment doesn't depend on other code
in the same function (or vice-versa) the compiler is still free to
reorder it.

Volatile is about not caching the variable in a CPU register since
it's "volatile"...

>
> I got the reordering idea from here:
> http://yarchive.net/comp/linux/compiler_barriers.html
>
> Thanks for committing,
> Zoltán Böszörményi
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-18 05:47:30
Message-ID: 11304.1363585650@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> The volatile marking shouldn't even be necessary there.
>> The signal handler doesn't writes to it, only the main code.

Well, (a) that's not the case in the patch as committed, and (b) even if
it were true, the volatile marking is still *necessary*, because that's
what tells the compiler it can't optimize away changes to the variable,
say on the grounds of there being another store with no intervening
fetches in the main-line code. Without that, a compiler that had
aggressively inlined the smaller functions could well deduce that the
disable_alarm() assignment was useless.

> Also, since the the variable assignment doesn't depend on other code
> in the same function (or vice-versa) the compiler is still free to
> reorder it.
> Volatile is about not caching the variable in a CPU register since
> it's "volatile"...

I don't believe you understand what volatile is about at all. Go read
the C standard: it's about requiring objects' values to actually match
the nominal program-specified values at sequence points. A more
accurate heuristic is that volatile tells the compiler there may be
other forces reading or writing the variable besides the ones it can see
in the current function's source code, and so it can't drop or reorder
accesses to the variable.

regards, tom lane


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, 'Craig Ringer' <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, 'Ants Aasma' <ants(at)cybertec(dot)at>, 'PostgreSQL Hackers' <pgsql-hackers(at)postgresql(dot)org>, 'Amit kapila' <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-18 10:40:48
Message-ID: 5146EF30.2040900@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-03-18 06:47 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>>> The volatile marking shouldn't even be necessary there.
>>> The signal handler doesn't writes to it, only the main code.
> Well, (a) that's not the case in the patch as committed, and (b) even if
> it were true, the volatile marking is still *necessary*, because that's
> what tells the compiler it can't optimize away changes to the variable,
> say on the grounds of there being another store with no intervening
> fetches in the main-line code. Without that, a compiler that had
> aggressively inlined the smaller functions could well deduce that the
> disable_alarm() assignment was useless.
>
>> Also, since the the variable assignment doesn't depend on other code
>> in the same function (or vice-versa) the compiler is still free to
>> reorder it.
>> Volatile is about not caching the variable in a CPU register since
>> it's "volatile"...
> I don't believe you understand what volatile is about at all. Go read
> the C standard: it's about requiring objects' values to actually match
> the nominal program-specified values at sequence points. A more
> accurate heuristic is that volatile tells the compiler there may be
> other forces reading or writing the variable besides the ones it can see
> in the current function's source code, and so it can't drop or reorder
> accesses to the variable.
>
> regards, tom lane

After reading up on a lot of volatile and memory barriers,
I am still not totally convinced.

For the record, sig_atomit_t is a plain int without any special
treatment from the compiler. It's atomic by nature on every 32-bit
and 64-bit CPU.

How about the attached patch over current GIT? In other words,
why I am wrong with this idea?

The problem that may arise if it's wrong is that transactions are
left waiting for the lock when the interrupt triggers and the variable
is still seen as false from the interrupt handler. But this doesn't happen.

FWIW, this small patch seems to give a 0,7 percent performance
boost for my settings:

shared_buffers = 256MB
work_mem = 8MB
effective_io_concurrency = 8
wal_level = hot_standby
checkpoint_segments = 64
random_page_cost = 1.8

Everything else is the default. I tested the patch on a 8-core
AMD FX-8120 CPU with this pgbench script below. Basically, it's
the default transaction prepended with "SET lock_timeout = 1;"
I have used the attached quick-and-dirty patch to pgbench to
ignore SQL errors coming from statements. "-s 100" was used
to initialize the test database.

\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
SET lock_timeout = 1;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid,
:delta, CURRENT_TIMESTAMP);
END;

Results of "pgbench -c 32 -j 32 -t 10000 -e -f script.sql" are
for the GIT version:

tps = 3366.844023 (including connections establishing)
tps = 3367.645454 (excluding connections establishing)

tps = 3379.784707 (including connections establishing)
tps = 3380.622317 (excluding connections establishing)

tps = 3385.198238 (including connections establishing)
tps = 3386.132433 (excluding connections establishing)

and with the barrier patch:

tps = 3412.799044 (including connections establishing)
tps = 3413.670832 (excluding connections establishing)

tps = 3389.796287 (including connections establishing)
tps = 3390.602187 (excluding connections establishing)

tps = 3405.924548 (including connections establishing)
tps = 3406.726997 (excluding connections establishing)

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
use-barrier-not-volatile.patch text/x-patch 1.0 KB
pgbench-ignore-errors.patch text/x-patch 1.8 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, 'Hans-Jürgen Schönig' <hs(at)cybertec(dot)at>, "'Ants Aasma'" <ants(at)cybertec(dot)at>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Amit kapila'" <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-18 14:09:06
Message-ID: 4639.1363615746@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> How about the attached patch over current GIT? In other words,
> why I am wrong with this idea?

Because it's wrong. Removing "volatile" means that the compiler is
permitted to optimize away stores (and fetches!) on the basis of their
being unnecessary according to straight-line analysis of the code.
Write barriers don't fix that, they only say that stores that the
compiler chooses to issue at all have to be ordered a certain way.

(There are also pretty serious questions as to whether pg_write_barrier
can be trusted yet, but it doesn't really matter here. Removing
volatile would break the code.)

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-21 18:24:10
Message-ID: CA+TgmoZLBDEJ+x=g_DPoNVAuKSJR8CUuq+=DSaqL7_rZpi9q+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 18, 2013 at 10:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> How about the attached patch over current GIT? In other words,
>> why I am wrong with this idea?
>
> Because it's wrong. Removing "volatile" means that the compiler is
> permitted to optimize away stores (and fetches!) on the basis of their
> being unnecessary according to straight-line analysis of the code.
> Write barriers don't fix that, they only say that stores that the
> compiler chooses to issue at all have to be ordered a certain way.

I don't think this is correct. The read and write barriers as
implemented are designed to function as compiler barriers also, just
as they do in the Linux kernel and every other piece of software I've
found that implements anything remotely like this, with the lone
exception of PostgreSQL. In PostgreSQL, spinlock acquisition and
release are defined as CPU barriers but not a compiler barrier, and
this necessitates extensive use of volatile all over the code base
which would be unnecessary if we did this the way it's done in Linux
and elsewhere.

However, Zoltan's patch probably isn't right either. First, a write
barrier isn't ever the correct solution when there's only one process
involved - which is the case here, because the relevant variables are
in backend-private memory. A compiler barrier - which is generally
far cheaper - might be the right thing, though. However, the position
of the barriers in his proposed patch is suspect. As implemented,
he's proposing to force each change to alarm_enabled to be scheduled
by the compiler before any other writes to memory are completed. But
there's no guard against the compiler sliding the change backward,
only forward. Now maybe that doesn't matter, if we're only concerned
about the timing of updates of alarm_enabled relative to other updates
of that same variable. But if there are multiple variables involved,
then it matters.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-22 00:16:39
Message-ID: 17142.1363911399@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Mar 18, 2013 at 10:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Because it's wrong. Removing "volatile" means that the compiler is
>> permitted to optimize away stores (and fetches!) on the basis of their
>> being unnecessary according to straight-line analysis of the code.
>> Write barriers don't fix that, they only say that stores that the
>> compiler chooses to issue at all have to be ordered a certain way.

> I don't think this is correct. The read and write barriers as
> implemented are designed to function as compiler barriers also, just
> as they do in the Linux kernel and every other piece of software I've
> found that implements anything remotely like this, with the lone
> exception of PostgreSQL. In PostgreSQL, spinlock acquisition and
> release are defined as CPU barriers but not a compiler barrier, and
> this necessitates extensive use of volatile all over the code base
> which would be unnecessary if we did this the way it's done in Linux
> and elsewhere.

I think you're just as mistaken as Zoltan. Barriers enforce ordering
of operations, not whether an operation occurs at all.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-22 14:29:15
Message-ID: CA+TgmobkMbQVCtKNYgXzA=g+OEteAkYDacCnM9XpeX332-yV9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 21, 2013 at 8:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Mar 18, 2013 at 10:09 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Because it's wrong. Removing "volatile" means that the compiler is
>>> permitted to optimize away stores (and fetches!) on the basis of their
>>> being unnecessary according to straight-line analysis of the code.
>>> Write barriers don't fix that, they only say that stores that the
>>> compiler chooses to issue at all have to be ordered a certain way.
>
>> I don't think this is correct. The read and write barriers as
>> implemented are designed to function as compiler barriers also, just
>> as they do in the Linux kernel and every other piece of software I've
>> found that implements anything remotely like this, with the lone
>> exception of PostgreSQL. In PostgreSQL, spinlock acquisition and
>> release are defined as CPU barriers but not a compiler barrier, and
>> this necessitates extensive use of volatile all over the code base
>> which would be unnecessary if we did this the way it's done in Linux
>> and elsewhere.
>
> I think you're just as mistaken as Zoltan. Barriers enforce ordering
> of operations, not whether an operation occurs at all.

Surely not. Suppose the user does this:

some_global_var = 1;
some_function();
some_global_var = 2;

I hope we can both agree that any compiler which thinks it doesn't
need to store 1 in some_global_var is completely nuts, because
some_function() might perform any arbitrary computation, including one
that depends on some_global_var being 1 rather than whatever value it
had before that. Of course, if a global optimizer can prove that
some_function() can't do anything that can possibly care about
some_global_var, then the store could be omitted, but not otherwise.
Should the compiler omit the store and should there then be a bug in
the program, we wouldn't say "oh, some_global_var ought to be declared
volatile". We would say "that compiler is buggy".

Now, conversely, in this situation, it seems to me (and I think you'll
agree) that the compiler could be forgiven for omitting one of the
stores:

some_global_var = 1;
local_var = 42;
some_global_var = 2;

If we declare some_global_var as volatile, it will force the compiler
to perform both stores regardless of what the optimizer thinks, and
moreover, the second store is required to happen after the first one,
because the definition of volatile is that volatile references are
globally sequenced with respect TO EACH OTHER. However, the store to
local_var could be moved around with respect to the other two or
omitted altogether unless local_var is also declared volatile.

Now, consider this:

some_global_var = 1;
pg_compiler_barrier(); /* or in Linux, barrier() */
some_global_var = 2;

The compiler barrier is exactly equivalent to the unknown function in
the first example, except that no function is actually called. The
semantics are precisely that the compiler must assume that, at the
point the barrier intervenes, an unknown operation will occur which
may depend on the contents of any word in memory and which may modify
any word in memory. Thus, the compiler may not postpone stores
requested before the barrier until after the barrier on the grounds
that the values will be overwritten after the barrier; and if any
global variables have been loaded into registers before the barrier it
must be assumed that, after the barrier, the registers may no longer
match those global variables.

It is true that any barrier, including a compiler barrier, serves only
to separate instructions. But I don't believe it follows in any way
that the barrier therefore prohibits reordering operations but not
omitting them altogether. Such a definition would have no practical
utility. The compiler is not free to willy-nilly leave out things
that the user asks it to do any more than it is free to willy-nilly
reorder them. If it were, programming would be chaos, and everything
would have to be volatile. What the compiler is free to do is to
reorder and omit instructions where the programmer won't, in a
single-thread execution context, be able to notice the difference.
And a compiler barrier is an explicit notification that, in essence,
the programmer will notice if things are not just the way he wrote
them when that point in the code is reached.

To see the difference between this and volatile, consider the following:

a = 1;
b = 1;
c = 1;
a = 2;
b = 2;
c = 2;

Absent any special precautions, the compiler may well optimize the
first set of stores away completely and perform the second set in any
order it likes. If we make all the variables volatile, it will do all
6 stores in precisely the order specified, omitting nothing and
reordering nothing. If we instead stick a compiler barrier just after
the assignment "c = 1", then the compiler must store 1 in all three
variables (in any order that it likes), and then store 2 in all three
variables (in any order that it likes). The "barrier" essentially
divides up the code into chunks and requires that those chunks be
optimized independently by the compiler without knowledge of what
earlier or later chunks are doing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Stephen Frost <sfrost(at)snowman(dot)net>, Hari Babu <haribabu(dot)kommi(at)huawei(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Strange Windows problem, lock_timeout test request
Date: 2013-03-22 16:49:29
Message-ID: CAM-w4HN9mMEudBVtGUXi9BYj=1bRwMt5cnu+3yEgD2io_f8nRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 22, 2013 at 2:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The "barrier" essentially
> divides up the code into chunks and requires that those chunks be
> optimized independently by the compiler without knowledge of what
> earlier or later chunks are doing

While all this sounds sensible I would love to see a gcc programmer or
llvm programmer actually comment on what they think volatile does and
what they want to implement in the compiler.

I'm a bit worried that we're making assumptions like "things happen in
a specific order" that aren't really justified. In these days of
superscalar execution and multi-level caches things may be weirder
than we're imagining.

--
greg