Re: pg_dump lock timeout

Lists: pgsql-hackerspgsql-patches
From: daveg <daveg(at)sonic(dot)net>
To: pgsql-patches(at)postgresql(dot)org
Cc: hari(at)efrontier(dot)com
Subject: pg_dump lock timeout
Date: 2008-05-11 11:30:47
Message-ID: 20080511113047.GV5673@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Attached is a patch to add a commandline option to pg_dump to limit how long
pg_dump will wait for locks during startup.

The intent of this patch is to allow pg_dump to fail if a table lock cannot
be taken in a reasonable time. This allows the caller of pg_dump to retry or
otherwise correct the situation, without having locks held for long periods,
and without pg_dump having a long window during which catalog changes can
occur.

It works by setting statement_timeout to the user specified delay during
the startup phase where it is taking access share locks on all the tables.
Once all the locks are taken, it sets statement_timeout back to the default.
If a lock table statement times out, the dump fails with the statement timed
out error.

The orginal motivation was a client who runs heavy batch workloads and uses
truncate table and other DML in long transactions. This has created some
unhappy interaction scenarios with pg_dump:

- pg_dump ends up waiting hours on a DML table lock that is part of a long
transaction. Once the lock is released, pg_dump runs only to find
some table later in the list has been dropped. So pg_dump fails.

- pg_dump waits on a lock while holding access share locks on most of the
tables. Other processes that want to do DML wait on pg_dump. After a
while, large parts of the application are blocked while pg_dump waits
on locks. Eventually the operations staff notice that pg_dump is
blocking production and kill the dump.

Please have a look and consider it for merging.

Thanks

-dg

--
David Gould
If simplicity worked, the world would be overrun with insects.


From: daveg <daveg(at)sonic(dot)net>
To: pgsql-patches(at)postgresql(dot)org
Cc: hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-05-11 13:00:35
Message-ID: 20080511130035.GW5673@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Sun, May 11, 2008 at 04:30:47AM -0700, daveg wrote:
>
> Attached is a patch to add a commandline option to pg_dump to limit how long
> pg_dump will wait for locks during startup.

Ooops, really attached this time.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.

Attachment Content-Type Size
pg_dump.c.timeout_patch text/plain 3.0 KB
pg_dump.sgml.timeout_patch text/plain 1.1 KB

From: David Fetter <david(at)fetter(dot)org>
To: daveg <daveg(at)sonic(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-07-02 15:56:26
Message-ID: 20080702155626.GC30328@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Sun, May 11, 2008 at 06:00:35AM -0700, David Gould wrote:
> On Sun, May 11, 2008 at 04:30:47AM -0700, daveg wrote:
> >
> > Attached is a patch to add a commandline option to pg_dump to
> > limit how long pg_dump will wait for locks during startup.
>
> Ooops, really attached this time.

Can we see about getting this into the July commitfest? Dave has
presented a use case complete with logs where having this could have
prevented a failed backup and consequent data loss.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: "Ken Camann" <kjcamann(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Working on native Windows x64 version of PostgreSQL
Date: 2008-07-02 21:41:07
Message-ID: 63c05a820807021441o4658171cm91d4984f3981ecef@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hi everyone.

For those who have seen my other thread, I have decided to take the
plunge: I am going to try to get PostgreSQL to compile as a native
Windows x64 application (so that it will be able to interface with x64
DLLs) and I will contribute my changes to the community.

As you know, many of the extra postgres features rely on external
libraries available from gnuwin32, so these must also be converted to
x64 DLLs or they won't work. I will be starting small, disabling all
these extra features and only trying to build the core functionality.

I plan to work on this in a few stages, submitted as work in progress
patches, because I think this will require a lot of changes. The
first stage is very simple though; I will be modifying the Perl
scripts in the src/tools/msvc directory to support the x64 compiler.
I am using Microsoft Visual C 2008, but I will try to ensure
everything works with Visual C 2005 (no previous versions have an x64
compiler, so there are only 2 platforms to support).

While this is the simplest change, it may need the most review. I am
a good C programmer but I barely know Perl. That has its advantages;
it will ensure I stick to the style currently used since I have no
other habits.

-Ken


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "Ken Camann" <kjcamann(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Working on native Windows x64 version of PostgreSQL
Date: 2008-07-02 21:50:31
Message-ID: 937d27e10807021450m4df0c366t870d6678a1abf61f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Wed, Jul 2, 2008 at 10:41 PM, Ken Camann <kjcamann(at)gmail(dot)com> wrote:
> Hi everyone.
>
> For those who have seen my other thread, I have decided to take the
> plunge: I am going to try to get PostgreSQL to compile as a native
> Windows x64 application (so that it will be able to interface with x64
> DLLs) and I will contribute my changes to the community.

:-)

> As you know, many of the extra postgres features rely on external
> libraries available from gnuwin32, so these must also be converted to
> x64 DLLs or they won't work. I will be starting small, disabling all
> these extra features and only trying to build the core functionality.

That's perfectly reasonable.

> I plan to work on this in a few stages, submitted as work in progress
> patches, because I think this will require a lot of changes. The
> first stage is very simple though; I will be modifying the Perl
> scripts in the src/tools/msvc directory to support the x64 compiler.
> I am using Microsoft Visual C 2008, but I will try to ensure
> everything works with Visual C 2005 (no previous versions have an x64
> compiler, so there are only 2 platforms to support).

I would suggest generating just VC2k5 project files, and then
modifying the build scripts to upgrade them first if required. We do
something similar with wxWidgets for pgAdmin's use - albeit converting
VC++6 files to 2k5 - using the /upgrade option in vcbuild.exe. I
assume something similar is feasible for 2k5 -> 2k8.

Regards, Dave

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: daveg <daveg(at)sonic(dot)net>
Subject: Re: pg_dump lock timeout
Date: 2008-07-03 01:33:46
Message-ID: 20080703013346.GP31154@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Dave,

Just a few comments regarding your pg_dump lock timeout patch (in
general I like the concept and agree with adding it):

- No validity checking that the argument passed in has anything to do
with a number. The backend will do this, but it strikes me as a bit
odd to not do any checking at argument processing time.

- You call the argument 'wait time' in the documentation, but 'DELAY'
in the command-line help. I'd recommend using one term and sticking
to it. You're already two lines in the command-line help, you can
spell it out as 'WAIT_TIME' or similar.

- getTables() uses different variables for each query, and I'm
inclined to agree with that approach to make following the code
easier. I'd encourage you to add a new variable for the
statement_timeout query rather than reusing the lockqry variable.
You could even offset this by removing the unused delqry variable.

Otherwise, looks good to me.

Thanks,

Stephen


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: daveg <daveg(at)sonic(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com, "Postgres Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] pg_dump lock timeout
Date: 2008-07-03 08:15:10
Message-ID: e51f66da0807030115y2520ecbdie9d17f6312636f3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 5/11/08, daveg <daveg(at)sonic(dot)net> wrote:
> Attached is a patch to add a commandline option to pg_dump to limit how long
> pg_dump will wait for locks during startup.

My quick review:

- It does not seem important enough to waste a short option on.
Having only long option should be enough.

- It would be more polite to do SET LOCAL instead SET.
(Eg. it makes safer to use pg_dump through pooler.)

- The statement_timeout is set back with "statement_timeout = default"
Maybe it would be better to do "= 0" here? Although such decision
would go outside the scope of the patch, I see no sense having
any other statement_timeout for actual dumping.

--
marko


From: daveg <daveg(at)sonic(dot)net>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] pg_dump lock timeout
Date: 2008-07-03 12:55:01
Message-ID: 20080703125501.GS866@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Jul 03, 2008 at 11:15:10AM +0300, Marko Kreen wrote:
> On 5/11/08, daveg <daveg(at)sonic(dot)net> wrote:
> > Attached is a patch to add a commandline option to pg_dump to limit how long
> > pg_dump will wait for locks during startup.
>
> My quick review:
>
> - It does not seem important enough to waste a short option on.
> Having only long option should be enough.

Agreed. I'll change it.

> - It would be more polite to do SET LOCAL instead SET.
> (Eg. it makes safer to use pg_dump through pooler.)

Also agreed. Thanks.

> - The statement_timeout is set back with "statement_timeout = default"
> Maybe it would be better to do "= 0" here? Although such decision
> would go outside the scope of the patch, I see no sense having
> any other statement_timeout for actual dumping.

I'd prefer to leave whatever policy is otherwise in place alone. I can see
use cases for either having or not having a timeout for pg_dump, but it does
seem outside the scope of this patch.

Thanks for you review and comments.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: daveg <daveg(at)sonic(dot)net>
Cc: Marko Kreen <markokr(at)gmail(dot)com>, pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] pg_dump lock timeout
Date: 2008-07-03 15:33:16
Message-ID: 3001.1215099196@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

daveg <daveg(at)sonic(dot)net> writes:
> On Thu, Jul 03, 2008 at 11:15:10AM +0300, Marko Kreen wrote:
>> - The statement_timeout is set back with "statement_timeout = default"
>> Maybe it would be better to do "= 0" here? Although such decision
>> would go outside the scope of the patch, I see no sense having
>> any other statement_timeout for actual dumping.

> I'd prefer to leave whatever policy is otherwise in place alone.

The policy in place in CVS HEAD is that pg_dump explicitly sets
statement_timeout to 0. Setting it to default would break that,
and will certainly not be accepted.

regards, tom lane


From: daveg <daveg(at)sonic(dot)net>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] pg_dump lock timeout
Date: 2008-07-16 12:37:36
Message-ID: 20080716123736.GJ11471@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Thu, Jul 03, 2008 at 05:55:01AM -0700, daveg wrote:
> On Thu, Jul 03, 2008 at 11:15:10AM +0300, Marko Kreen wrote:
> > On 5/11/08, daveg <daveg(at)sonic(dot)net> wrote:
> > > Attached is a patch to add a commandline option to pg_dump to limit how long
> > > pg_dump will wait for locks during startup.
> >
> > My quick review:
> >
> > - It does not seem important enough to waste a short option on.
> > Having only long option should be enough.
>
> Agreed. I'll change it.
>
> > - It would be more polite to do SET LOCAL instead SET.
> > (Eg. it makes safer to use pg_dump through pooler.)
>
> Also agreed. Thanks.

On second glance, pg_dump sets lots of variables without using SET LOCAL.
I think fixing that must be the subject of a separate patch as fixing just
one of many will only cause confusion.

> > - The statement_timeout is set back with "statement_timeout = default"
> > Maybe it would be better to do "= 0" here? Although such decision
> > would go outside the scope of the patch, I see no sense having
> > any other statement_timeout for actual dumping.
>
> I'd prefer to leave whatever policy is otherwise in place alone. I can see
> use cases for either having or not having a timeout for pg_dump, but it does
> seem outside the scope of this patch.

As it happens, another patch has set the policy to "statement_timeout = 0",
so I will follow that.

I'm sending in the revised patch today.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.


From: daveg <daveg(at)sonic(dot)net>
To: pgsql-patches(at)postgresql(dot)org
Cc: daveg(at)sonic(dot)net, hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-07-17 02:44:07
Message-ID: 20080717024407.GL11471@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Here is an updated version of this patch against head. It builds, runs and
functions as expected. I did not build the sgml.

I've made changes based on various comments as follows:

- use WAIT_TIME in description consistantly. Reworded for clarity.
(Stephan Frost)

- Use a separate query buffer in getTables() (Stephan Frost)

- sets statement_timeout=0 afterwards per new policy (Tom Lane, Marko Kreen)

- has only --long-option to conserve short option letters (Marko Kreen)

Regards

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.

Attachment Content-Type Size
pg-dump-timeout.patch text/plain 5.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: daveg <daveg(at)sonic(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-07-20 18:50:50
Message-ID: 13580.1216579850@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

daveg <daveg(at)sonic(dot)net> writes:
> Here is an updated version of this patch against head. It builds, runs and
> functions as expected. I did not build the sgml.

Applied with mostly minor cosmetic improvements --- the only actual
error I noticed was failing to check whether the server version supports
statement_timeout.

In most cases our policy has been that pg_dumpall should accept and pass
through any pg_dump option for which it's sensible to do so. I did not
make that happen but it seems it'd be a reasonable follow-on patch.

A minor point is that the syntax "-X lock-wait-timeout=n" or
"-X lock-wait-timeout n" won't work, although perhaps people used to
-X might expect it to. Since we deprecate -X (and don't even document
it anymore), I thought that making this work would be much more trouble
than it's worth, but perhaps that's open to argument.

regards, tom lane


From: daveg <daveg(at)sonic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-07-21 07:21:31
Message-ID: 20080721072131.GE30869@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Sun, Jul 20, 2008 at 02:50:50PM -0400, Tom Lane wrote:
> daveg <daveg(at)sonic(dot)net> writes:
> > Here is an updated version of this patch against head. It builds, runs and
> > functions as expected. I did not build the sgml.
>
> Applied with mostly minor cosmetic improvements --- the only actual
> error I noticed was failing to check whether the server version supports
> statement_timeout.

I chose not to test backend version on the grounds that getting an explicit
failure for an explicitly requested option would be preferable to it being
silently ignored. However if the user is trying to use the same scripts for
many versions then ignoring unsupported but unessential features may be
preferred.

One of the cosmetic changes made in response to other reviewers was to
not reuse lockquery, instead to have a separate query buffer. You have
reversed that and eliminated lockquery too. Which seems better.

> In most cases our policy has been that pg_dumpall should accept and pass
> through any pg_dump option for which it's sensible to do so. I did not
> make that happen but it seems it'd be a reasonable follow-on patch.

I'll remember that next time.

> A minor point is that the syntax "-X lock-wait-timeout=n" or
> "-X lock-wait-timeout n" won't work, although perhaps people used to
> -X might expect it to. Since we deprecate -X (and don't even document
> it anymore), I thought that making this work would be much more trouble
> than it's worth, but perhaps that's open to argument.

All the other -X options are flags and supported directly by the getopt
machinery. Adding a -X option with an argument would require parsing
the argument by hand including dealing with '=' or ' ' as the separator and
telling getopt you had eaten an extra argument. This seemed a bit too much
code for the value of supporting a deprecated format for a brand new option.

Finally, you changed the option value and case from 1 to case 2. getopt_long()
only returns the value argument when there is no flag to set, so 1 was unique
and could have been read as "the first no-short option with an argument".

Thanks for checking this in.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: daveg <daveg(at)sonic(dot)net>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-07-21 07:43:11
Message-ID: 3279.1216626191@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

daveg <daveg(at)sonic(dot)net> writes:
> On Sun, Jul 20, 2008 at 02:50:50PM -0400, Tom Lane wrote:
>> In most cases our policy has been that pg_dumpall should accept and pass
>> through any pg_dump option for which it's sensible to do so. I did not
>> make that happen but it seems it'd be a reasonable follow-on patch.

> I'll remember that next time.

Er .. actually that was a direct request for you to do it.

> Finally, you changed the option value and case from 1 to case 2. getopt_long
> only returns the value argument when there is no flag to set, so 1 was unique
> and could have been read as "the first no-short option with an argument".

Yeah. The code *worked* as you submitted it, but what was bothering me
was that the "val = 1" table entries worked in two completely different
ways for the different argument types. I first thought that you'd
broken the existing long argument options --- you hadn't, but I had to
go re-read the getopt_long source to convince myself of that. So it
seemed like using a different "val" value might help clarify the
difference in behavior for future readers of the code. In particular
the next guy who wants to add a long option with parameter value would
certainly not be able to use val = 1; but I thought the code as you
had it wouldn't give him any clue what to do. If you've got a better
idea about how to deal with that, feel free...

regards, tom lane


From: daveg <daveg(at)sonic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org, hari(at)efrontier(dot)com
Subject: Re: pg_dump lock timeout
Date: 2008-07-22 05:57:35
Message-ID: 20080722055735.GW30869@sonic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Mon, Jul 21, 2008 at 03:43:11AM -0400, Tom Lane wrote:
> daveg <daveg(at)sonic(dot)net> writes:
> > On Sun, Jul 20, 2008 at 02:50:50PM -0400, Tom Lane wrote:
> >> In most cases our policy has been that pg_dumpall should accept and pass
> >> through any pg_dump option for which it's sensible to do so. I did not
> >> make that happen but it seems it'd be a reasonable follow-on patch.
>
> > I'll remember that next time.
>
> Er .. actually that was a direct request for you to do it.


Attached is a the followon patch for pg_dumpall and docs to match pg_dump.

On a second topic, is anyone working on a parallel dump/load? I'd be
interested in helping.

-dg

--
David Gould daveg(at)sonic(dot)net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.

Attachment Content-Type Size
pg-dumpall-timeout.patch text/plain 2.4 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: daveg <daveg(at)sonic(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, hari(at)efrontier(dot)com
Subject: Re: [PATCHES] pg_dump lock timeout
Date: 2008-08-29 17:32:07
Message-ID: 20080829173207.GK3983@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

daveg wrote:

> Attached is a the followon patch for pg_dumpall and docs to match pg_dump.

Thanks, committed. (The only change I did was to correct the alignment
in help output.)

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support