WIP - Add ability to constrain backend temporary file space

Lists: pgsql-hackers
From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 03:17:44
Message-ID: 4D5DE4D8.30000@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Recently two systems here have suffered severely with excessive
temporary file creation during query execution. In one case it could
have been avoided by more stringent qa before application code release,
whereas the other is an ad-hoc system, and err...yes.

In both cases it would have been great to be able to constrain the
amount of temporary file space a query could use. In theory you can sort
of do this with the various ulimits, but it seems pretty impractical as
at that level all files look the same and you'd be just as likely to
unexpectedly crippled the entire db a few weeks later when a table grows...

I got to wonder how hard this would be to do in Postgres, and attached
is my (WIP) attempt. It provides a guc (max_temp_files_size) to limit
the size of all temp files for a backend and amends fd.c cancel
execution if the total size of temporary files exceeds this.

This is WIP, it does seem to work ok, but some areas/choices I'm not
entirely clear about are mentioned in the patch itself. Mainly:

- name of the guc... better suggestions welcome
- datatype for the guc - real would be good, but at the moment the nice
parse KB/MB/GB business only works for int

regards

Mark

Attachment Content-Type Size
temp-files-v1.patch.gz application/x-gzip 1.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 13:34:43
Message-ID: AANLkTinOG7PfvaP9wzXheULa7EKkUsfWjanb22xQ4=k7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 17, 2011 at 10:17 PM, Mark Kirkwood
<mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> wrote:
> This is WIP, it does seem to work ok, but some areas/choices I'm not
> entirely clear about are mentioned in the patch itself. Mainly:
>
> - name of the guc... better suggestions welcome
> - datatype for the guc - real would be good, but at the moment the nice
> parse KB/MB/GB business only works for int

Please add this to the next CommitFest:

https://commitfest.postgresql.org/action/commitfest_view/open

With respect to the datatype of the GUC, int seems clearly correct.
Why would you want to use a float?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 19:41:27
Message-ID: 4D5ECB67.2020702@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark,

> I got to wonder how hard this would be to do in Postgres, and attached
> is my (WIP) attempt. It provides a guc (max_temp_files_size) to limit
> the size of all temp files for a backend and amends fd.c cancel
> execution if the total size of temporary files exceeds this.

First, are we just talking about pgsql_tmp here, or the pg_temp
tablespace? That is, just sort/hash files, or temporary tables as well?

Second, the main issue with these sorts of macro-counters has generally
been their locking effect on concurrent activity. Have you been able to
run any tests which try to run lots of small externally-sorted queries
at once on a multi-core machine, and checked the effect on throughput?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 19:44:34
Message-ID: AANLkTimGC0-wH+Owwq=MwWQk9r+uQx60a+RGA5WP1Y25@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Second, the main issue with these sorts of macro-counters has generally
> been their locking effect on concurrent activity.  Have you been able to
> run any tests which try to run lots of small externally-sorted queries
> at once on a multi-core machine, and checked the effect on throughput?

Since it's apparently a per-backend limit, that doesn't seem relevant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 19:48:24
Message-ID: 4D5ECD08.5020706@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/18/11 11:44 AM, Robert Haas wrote:
> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Second, the main issue with these sorts of macro-counters has generally
>> been their locking effect on concurrent activity. Have you been able to
>> run any tests which try to run lots of small externally-sorted queries
>> at once on a multi-core machine, and checked the effect on throughput?
>
> Since it's apparently a per-backend limit, that doesn't seem relevant.

Oh! I missed that.

What good would a per-backend limit do, though?

And what happens with queries which exceed the limit? Error message? Wait?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 20:37:34
Message-ID: AANLkTi==xu-LRoHHMuPHAHfLU2r3W1ziKKPoycH+UY9s@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 18, 2011 at 2:48 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 2/18/11 11:44 AM, Robert Haas wrote:
>> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> Second, the main issue with these sorts of macro-counters has generally
>>> been their locking effect on concurrent activity.  Have you been able to
>>> run any tests which try to run lots of small externally-sorted queries
>>> at once on a multi-core machine, and checked the effect on throughput?
>>
>> Since it's apparently a per-backend limit, that doesn't seem relevant.
>
> Oh!  I missed that.
>
> What good would a per-backend limit do, though?
>
> And what happens with queries which exceed the limit?  Error message?  Wait?

Well I have not RTFP, but I assume it'd throw an error. Waiting isn't
going to accomplish anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 21:19:17
Message-ID: 4D5EE255.7010507@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/02/11 02:34, Robert Haas wrote:
>
> Please add this to the next CommitFest:
>
> https://commitfest.postgresql.org/action/commitfest_view/open
>
> With respect to the datatype of the GUC, int seems clearly correct.
> Why would you want to use a float?
>

Added. With respect to the datatype, using int with KB units means the
largest temp size is approx 2047GB - I know that seems like a lot now...
but maybe someone out there wants (say) their temp files limited to
4096GB :-)

Cheers

Mark


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 21:33:48
Message-ID: 4D5EE5BC.303@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/02/11 08:48, Josh Berkus wrote:
> On 2/18/11 11:44 AM, Robert Haas wrote:
>> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus<josh(at)agliodbs(dot)com> wrote:
>>> Second, the main issue with these sorts of macro-counters has generally
>>> been their locking effect on concurrent activity. Have you been able to
>>> run any tests which try to run lots of small externally-sorted queries
>>> at once on a multi-core machine, and checked the effect on throughput?
>> Since it's apparently a per-backend limit, that doesn't seem relevant.
> Oh! I missed that.
>
> What good would a per-backend limit do, though?
>
> And what happens with queries which exceed the limit? Error message? Wait?
>
>

By "temp files" I mean those in pgsql_tmp. LOL - A backend limit will
have the same sort of usefulness as work_mem does - i.e stop a query
eating all your filesystem space or bringing a server to its knees with
io load. We have had this happen twice - I know of other folks who have too.

Obviously you need to do the same sort of arithmetic as you do with
work_mem to decide on a reasonable limit to cope with multiple users
creating temp files. Conservative dbas might want to set it to (free
disk)/max_connections etc. Obviously for ad-hoc systems it is a bit more
challenging - but having a per-backend limit is way better than having
what we have now, which is ... errr... nothing.

As an example I'd find it useful to avoid badly written queries causing
too much io load on the db backend of (say) a web system (i.e such a
system should not *have* queries that want to use that much resource).

To answer the other question, what happens when the limit is exceeded is
modeled on statement timeout, i.e query is canceled and a message says
why (exceeded temp files size).

Cheers

Mark


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 21:38:27
Message-ID: 4D5EE6D3.9010803@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Obviously you need to do the same sort of arithmetic as you do with
> work_mem to decide on a reasonable limit to cope with multiple users
> creating temp files. Conservative dbas might want to set it to (free
> disk)/max_connections etc. Obviously for ad-hoc systems it is a bit more
> challenging - but having a per-backend limit is way better than having
> what we have now, which is ... errr... nothing.

Agreed.

> To answer the other question, what happens when the limit is exceeded is
> modeled on statement timeout, i.e query is canceled and a message says
> why (exceeded temp files size).

When does this happen? When you try to allocate the file, or when it
does the original tape sort estimate?

The disadvantage of the former is that the user waited for minutes in
order to have their query cancelled. The disadvantage of the latter is
that the estimate isn't remotely accurate.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 21:41:23
Message-ID: 16365.1298065283@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz> writes:
> Added. With respect to the datatype, using int with KB units means the
> largest temp size is approx 2047GB - I know that seems like a lot now...
> but maybe someone out there wants (say) their temp files limited to
> 4096GB :-)

[ shrug... ] Sorry, I can't imagine a use case for this parameter where
the value isn't a *lot* less than that. Maybe if it were global, but
not if it's per-backend.

regards, tom lane


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 21:50:39
Message-ID: 4D5EE9AF.10903@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/02/11 10:38, Josh Berkus wrote:
>
>> To answer the other question, what happens when the limit is exceeded is
>> modeled on statement timeout, i.e query is canceled and a message says
>> why (exceeded temp files size).
> When does this happen? When you try to allocate the file, or when it
> does the original tape sort estimate?
>
> The disadvantage of the former is that the user waited for minutes in
> order to have their query cancelled. The disadvantage of the latter is
> that the estimate isn't remotely accurate.
>

Neither - it checks each write (I think this is pretty cheap - adds two
int and double + operations and a /, > operation to FileWrite). If the
check shows you've written more than the limit, you get canceled. So you
can exceed the limit by 1 buffer size.

Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
of the cliff' type of protection. The advantage is there are no false
positives...

Cheers

Mark


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 22:30:13
Message-ID: 4D5EF2F5.5070801@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
> of the cliff' type of protection. The advantage is there are no false
> positives...

Yeah, just trying to get a handle on the proposed feature. I have no
objections; it seems like a harmless limit for most people, and useful
to a few.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-02-18 22:40:59
Message-ID: 4D5EF57B.7020106@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19/02/11 11:30, Josh Berkus wrote:
>> Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
>> of the cliff' type of protection. The advantage is there are no false
>> positives...
> Yeah, just trying to get a handle on the proposed feature. I have no
> objections; it seems like a harmless limit for most people, and useful
> to a few.
>
No worries and sorry, I should have used the "per backend" phrase in the
title to help clarify what was intended.

Cheers

Mark


From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP - Add ability to constrain backend temporary file space
Date: 2011-03-09 23:17:16
Message-ID: 4D780A7C.8060502@catalyst.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

New version:

- adds documentation
- adds category RESOURCES_DISK

Attachment Content-Type Size
temp-files-v2.patch.gz application/x-gzip 2.6 KB