Proposal "VACUUM SCHEMA"

Lists: pgsql-hackers
From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal "VACUUM SCHEMA"
Date: 2014-12-21 16:48:16
Message-ID: CAFcNs+pHJLmLM9n=B6aofV3XPpwHq+znnEjV9UkvPNSZyegXow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

I work with some customer that have databases with a lot of schemas and
sometimes we need to run manual VACUUM in one schema, and would be nice to
have a new option to run vacuum in relations from a specific schema.

The new syntax could be:

VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] { [ table_name ] | SCHEMA
schema_name }

Also I'll add a new option to "vacuumdb" client:

-S, --schema=SCHEMA

I can work on this feature to 2015/02 CF.

Thoughts?

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: fabriziomello(at)gmail(dot)com
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-21 19:18:33
Message-ID: 3418.1419189513@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello(at)gmail(dot)com> writes:
> I work with some customer that have databases with a lot of schemas and
> sometimes we need to run manual VACUUM in one schema, and would be nice to
> have a new option to run vacuum in relations from a specific schema.

I'm pretty skeptical of this alleged use-case. Manual vacuuming ought
to be mostly a thing of the past, and even if it's not, hitting
*everything* in a schema should seldom be an appropriate thing to do.

While the feature itself might be fairly innocuous, I'm just wondering
why we need to encourage manual vacuuming. And why that, but not
say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...

regards, tom lane


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-21 21:30:35
Message-ID: CAFcNs+rKYqKdkF6cuvKM4e3UBKGGmmzVJxOFy=s=tP4Aw46-+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Dec 21, 2014 at 5:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello(at)gmail(dot)com> writes:
> > I work with some customer that have databases with a lot of schemas and
> > sometimes we need to run manual VACUUM in one schema, and would be nice
to
> > have a new option to run vacuum in relations from a specific schema.
>
> I'm pretty skeptical of this alleged use-case. Manual vacuuming ought
> to be mostly a thing of the past, and even if it's not, hitting
> *everything* in a schema should seldom be an appropriate thing to do.
>

I agree manual vacuum is a thing of the past, but autovacuum doesn't solve
100% of the cases, and sometimes we need to use it so my proposal is just
do help DBAs and/or Sysadmins to write simple maintenance scripts.

> While the feature itself might be fairly innocuous, I'm just wondering
> why we need to encourage manual vacuuming.

IMHO we will not encourage manual vacuuming, just give more flexibility to
users.

> And why that, but not
> say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
>

+1. I can write patches for each of this maintenance statement too.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: <fabriziomello(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 02:32:32
Message-ID: 549782C0.7050607@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/21/14, 3:30 PM, Fabrízio de Royes Mello wrote:
>
> On Sun, Dec 21, 2014 at 5:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
> >
> > =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello(at)gmail(dot)com <mailto:fabriziomello(at)gmail(dot)com>> writes:
> > > I work with some customer that have databases with a lot of schemas and
> > > sometimes we need to run manual VACUUM in one schema, and would be nice to
> > > have a new option to run vacuum in relations from a specific schema.
> >
> > I'm pretty skeptical of this alleged use-case. Manual vacuuming ought
> > to be mostly a thing of the past, and even if it's not, hitting
> > *everything* in a schema should seldom be an appropriate thing to do.
> >
>
> I agree manual vacuum is a thing of the past, but autovacuum doesn't solve 100% of the cases, and sometimes we need to use it so my proposal is just do help DBAs and/or Sysadmins to write simple maintenance scripts.

Just one example of that is pre-emptively vacuuming during slower periods. Nothing spells "fun" like a freeze vacuum in the middle of a busy lunch period for a website.

Similarly, it's common to need to proactively vacuum after a data load, and since it's not unusual for there to be a schema dedicated to loading data, this makes that easier.

> > And why that, but not
> > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
> >
>
> +1. I can write patches for each of this maintenance statement too.

If we're going to go that route, then perhaps it would make more sense to create a command that allows you to apply a second command to every object in a schema. We would have to be careful about PreventTransactionChain commands.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 02:55:14
Message-ID: CAFcNs+qR53xD+1J6yn+ZrFpPHEiGnz47N2kJSX_wQUwv3aattw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Em segunda-feira, 22 de dezembro de 2014, Jim Nasby <
Jim(dot)Nasby(at)bluetreble(dot)com> escreveu:

> On 12/21/14, 3:30 PM, Fabrízio de Royes Mello wrote:
>
>>
>> On Sun, Dec 21, 2014 at 5:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:
>> tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>> >
>> > =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello(at)gmail(dot)com
>> <mailto:fabriziomello(at)gmail(dot)com>> writes:
>> > > I work with some customer that have databases with a lot of schemas
>> and
>> > > sometimes we need to run manual VACUUM in one schema, and would be
>> nice to
>> > > have a new option to run vacuum in relations from a specific schema.
>> >
>> > I'm pretty skeptical of this alleged use-case. Manual vacuuming ought
>> > to be mostly a thing of the past, and even if it's not, hitting
>> > *everything* in a schema should seldom be an appropriate thing to do.
>> >
>>
>> I agree manual vacuum is a thing of the past, but autovacuum doesn't
>> solve 100% of the cases, and sometimes we need to use it so my proposal is
>> just do help DBAs and/or Sysadmins to write simple maintenance scripts.
>>
>
> Just one example of that is pre-emptively vacuuming during slower periods.
> Nothing spells "fun" like a freeze vacuum in the middle of a busy lunch
> period for a website.
>
> Similarly, it's common to need to proactively vacuum after a data load,
> and since it's not unusual for there to be a schema dedicated to loading
> data, this makes that easier.

Good example. Thanks.

>
> > And why that, but not
>> > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
>> >
>>
>> +1. I can write patches for each of this maintenance statement too.
>>
>
> If we're going to go that route, then perhaps it would make more sense to
> create a command that allows you to apply a second command to every object
> in a schema. We would have to be careful about PreventTransactionChain
> commands.

Sorry but I don't understand what you meant. Can you explain more about
your idea?

Regards,

Fabrízio Mello

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: José Luis Tallón <jltallon(at)adv-solutions(dot)net>
To: fabriziomello(at)gmail(dot)com
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 15:55:50
Message-ID: 54983F06.9090800@adv-solutions.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/21/2014 10:30 PM, Fabrízio de Royes Mello wrote:
> [snip]

I do agree that "vacuum schema" might very well be useful (I'll probably
use it myself from time to time, too).
ANALYZE SCHEMA (specially coupled with some transaction-wide "SET
statistics_target" could be beneficial)

>
> > And why that, but not
> > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
> >
>
> +1. I can write patches for each of this maintenance statement too.

Hmm... I think Tom might have been a bit rethorical (or even sarcastic
with that), but I can definitely be wrong.

Do we really want to have some such operation potentially (and
inadvertently) locking for *hours* at a time?

CLUSTER SCHEMA somename;

... where schema "somename" contains "myHugeTable"

Given that the cluster command exclusively locks and rewrites the
table, it might lock queries and overwhelm the I/O subsystem for quite a
long time.

TRUNCATE SCHEMA whatever sounds quite dangerous, too.

Just my .02€

/ J.L.


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 16:05:49
Message-ID: 20141222160549.GB32020@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-12-21 14:18:33 -0500, Tom Lane wrote:
> =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello(at)gmail(dot)com> writes:
> > I work with some customer that have databases with a lot of schemas and
> > sometimes we need to run manual VACUUM in one schema, and would be nice to
> > have a new option to run vacuum in relations from a specific schema.
>
> I'm pretty skeptical of this alleged use-case. Manual vacuuming ought
> to be mostly a thing of the past, and even if it's not, hitting
> *everything* in a schema should seldom be an appropriate thing to do.

Based on my experience autovacuum isn't sufficient on bigger high
throughput databases. At the very least manual vacuuming with lower
freeze_table_age settings during low-load times is required lest
anti-wraparound vacuums increase load too much during prime business
hours.
That said, I don't see how this feature is actually helpful in those
cases. In pretty much all of what I've seen you'd want to have more
complex selection criteria than the schema.

> While the feature itself might be fairly innocuous, I'm just wondering
> why we need to encourage manual vacuuming. And why that, but not
> say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...

There's one argument for supporting more for VACUUM than the rest - it
can't be executed directly as the result of a query as the others
can... I wonder if that'd not better be answered by adding a feature to
vacuumdb that allows selecting the to-be-vacuumed table by a user
defined query.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: José Luis Tallón <jltallon(at)adv-solutions(dot)net>
Cc: fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 16:51:57
Message-ID: 20141222165157.GD1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

José Luis Tallón wrote:
> On 12/21/2014 10:30 PM, Fabrízio de Royes Mello wrote:
> >[snip]
>
> I do agree that "vacuum schema" might very well be useful (I'll probably use
> it myself from time to time, too).
> ANALYZE SCHEMA (specially coupled with some transaction-wide "SET
> statistics_target" could be beneficial)

We already have transanction-wide SET -- it's spelled SET LOCAL.

> >
> >> And why that, but not say schema-wide ANALYZE, CLUSTER, TRUNCATE,
> >> ...
> >
> >+1. I can write patches for each of this maintenance statement too.
>
> Hmm... I think Tom might have been a bit rethorical (or even sarcastic with
> that),

That was my impression too.

> Do we really want to have some such operation potentially (and
> inadvertently) locking for *hours* at a time?
>
> CLUSTER SCHEMA somename;
>
> ... where schema "somename" contains "myHugeTable"
>
> Given that the cluster command exclusively locks and rewrites the table,
> it might lock queries and overwhelm the I/O subsystem for quite a long time.

Multi-table CLUSTER uses multiple transactions, so this should not be an
issue. That said, I don't think there's much point in CLUSTER SCHEMA,
much less TRUNCATE SCHEMA. Do you normally organize your schemas so
that there are some that contain only tables that need to be truncated
together? That would be a strange use case.

Overall, this whole line of development seems like bloating the parse
tables for little gain.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Christoph Berg <cb(at)df7cb(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: José Luis Tallón <jltallon(at)adv-solutions(dot)net>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 16:58:49
Message-ID: 20141222165836.GA11110@msg.df7cb.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Re: Alvaro Herrera 2014-12-22 <20141222165157(dot)GD1768(at)alvh(dot)no-ip(dot)org>
> Multi-table CLUSTER uses multiple transactions, so this should not be an
> issue. That said, I don't think there's much point in CLUSTER SCHEMA,
> much less TRUNCATE SCHEMA. Do you normally organize your schemas so
> that there are some that contain only tables that need to be truncated
> together? That would be a strange use case.

Having a schema that's only used for importing data in batch jobs
doesn't sound too unreasonable. It could then be cleaned in a simple
"TRUNCATE SCHEMA import_area" command.

> Overall, this whole line of development seems like bloating the parse
> tables for little gain.

Reading the thread, my impression was that most people opposed the
idea because there's ways to script "vacuum schema", or because of
"people shouldn't be invoking manual vacuums anyway". I think the
patch tries to solve a practical problem, and does have its merits.

Christoph
--
cb(at)df7cb(dot)de | http://www.df7cb.de/


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: José Luis Tallón <jltallon(at)adv-solutions(dot)net>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:05:42
Message-ID: 20141222170542.GF3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
> Multi-table CLUSTER uses multiple transactions, so this should not be an
> issue. That said, I don't think there's much point in CLUSTER SCHEMA,
> much less TRUNCATE SCHEMA. Do you normally organize your schemas so
> that there are some that contain only tables that need to be truncated
> together? That would be a strange use case.

I could see it happening in environments which use schemas when doing
partitioning. eg: data_2014 contains all of the data_201401-201412
monthly (or perhaps weekly) tables.

> Overall, this whole line of development seems like bloating the parse
> tables for little gain.

Still, I see this point also. I do think it'd be really great if we
could figure out a way to segregate these kinds of DDL / maintenance
commands from the normal select/insert/update/delete SQL parsing, such
that we could add more options, etc, to those longer running and less
frequent commands without impacting parse time for the high-volume
commands.

I'm less concerned about the memory impact, except to the extent that it
impacts throughput and performance.

Thanks,

Stephen


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: José Luis Tallón <jltallon(at)adv-solutions(dot)net>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:11:07
Message-ID: 20141222171107.GE1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost wrote:
> * Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:

> > Overall, this whole line of development seems like bloating the parse
> > tables for little gain.
>
> Still, I see this point also. I do think it'd be really great if we
> could figure out a way to segregate these kinds of DDL / maintenance
> commands from the normal select/insert/update/delete SQL parsing, such
> that we could add more options, etc, to those longer running and less
> frequent commands without impacting parse time for the high-volume
> commands.

We do have a parenthesized options clause in VACUUM. I think adding
this as a clause there would be pretty much free.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:12:12
Message-ID: 20141222171212.GG3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> On 2014-12-21 14:18:33 -0500, Tom Lane wrote:
> > While the feature itself might be fairly innocuous, I'm just wondering
> > why we need to encourage manual vacuuming. And why that, but not
> > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
>
> There's one argument for supporting more for VACUUM than the rest - it
> can't be executed directly as the result of a query as the others
> can... I wonder if that'd not better be answered by adding a feature to
> vacuumdb that allows selecting the to-be-vacuumed table by a user
> defined query.

Wow. That's certainly an interesting idea.

We might end up turning the autovacuum process into a generalized
scheduler/cron-like entity that way though. I'd rather we just build
that. Users would then be able to run a script periodically which
would add VACUUM commands to be run on whichever tables they want to
the jobs queue, either for immediate execution or at whatever time they
want (or possibly chronically :).

Thanks!

Stephen


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:15:51
Message-ID: 20141222171551.GC32020@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-12-22 12:12:12 -0500, Stephen Frost wrote:
> * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> > On 2014-12-21 14:18:33 -0500, Tom Lane wrote:
> > > While the feature itself might be fairly innocuous, I'm just wondering
> > > why we need to encourage manual vacuuming. And why that, but not
> > > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
> >
> > There's one argument for supporting more for VACUUM than the rest - it
> > can't be executed directly as the result of a query as the others
> > can... I wonder if that'd not better be answered by adding a feature to
> > vacuumdb that allows selecting the to-be-vacuumed table by a user
> > defined query.
>
> Wow. That's certainly an interesting idea.
>
> We might end up turning the autovacuum process into a generalized
> scheduler/cron-like entity that way though.

I'm not talking about autovacuum, just plain vacuumdb.

> I'd rather we just build
> that. Users would then be able to run a script periodically which
> would add VACUUM commands to be run on whichever tables they want to
> the jobs queue, either for immediate execution or at whatever time they
> want (or possibly chronically :).

And this discussion just feature creeped beyond anything realistic... :)

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:17:15
Message-ID: 20141222171715.GF1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost wrote:
> * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:

> > There's one argument for supporting more for VACUUM than the rest - it
> > can't be executed directly as the result of a query as the others
> > can... I wonder if that'd not better be answered by adding a feature to
> > vacuumdb that allows selecting the to-be-vacuumed table by a user
> > defined query.
>
> Wow. That's certainly an interesting idea.

+1.

> We might end up turning the autovacuum process into a generalized
> scheduler/cron-like entity that way though. I'd rather we just build
> that. Users would then be able to run a script periodically which
> would add VACUUM commands to be run on whichever tables they want to
> the jobs queue, either for immediate execution or at whatever time they
> want (or possibly chronically :).

This too. I think there's one or two orders of magnitude of difference
in implementation effort of these two ideas, however.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:17:33
Message-ID: 20141222171733.GH3062@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> On 2014-12-22 12:12:12 -0500, Stephen Frost wrote:
> > We might end up turning the autovacuum process into a generalized
> > scheduler/cron-like entity that way though.
>
> I'm not talking about autovacuum, just plain vacuumdb.

Oh, right, clearly I was thinking of autovacuum. Adding an option like
that to vacuumdb would certainly be a lot more straight-forward.

> > I'd rather we just build
> > that. Users would then be able to run a script periodically which
> > would add VACUUM commands to be run on whichever tables they want to
> > the jobs queue, either for immediate execution or at whatever time they
> > want (or possibly chronically :).
>
> And this discussion just feature creeped beyond anything realistic... :)

Yeah, but I really *want* this... ;)

Thanks!

Stephen


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 17:23:07
Message-ID: CAFcNs+p=pNGcT+MT4DVCL3b=xjH++RaQrZPKw8OW0LK86tbX0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 22, 2014 at 3:17 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:
>
> Stephen Frost wrote:
> > * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
>
> > > There's one argument for supporting more for VACUUM than the rest - it
> > > can't be executed directly as the result of a query as the others
> > > can... I wonder if that'd not better be answered by adding a feature
to
> > > vacuumdb that allows selecting the to-be-vacuumed table by a user
> > > defined query.
> >
> > Wow. That's certainly an interesting idea.
>
> +1.
>

Then to simplify can we allow the "--table" option of vacuumdb act similar
to the "--table" option of pg_dump??

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, fabriziomello(at)gmail(dot)com
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 19:35:54
Message-ID: 5498729A.7070908@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 12/21/2014 02:18 PM, Tom Lane wrote:
> =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello(at)gmail(dot)com> writes:
>> I work with some customer that have databases with a lot of schemas and
>> sometimes we need to run manual VACUUM in one schema, and would be nice to
>> have a new option to run vacuum in relations from a specific schema.
> I'm pretty skeptical of this alleged use-case. Manual vacuuming ought
> to be mostly a thing of the past, and even if it's not, hitting
> *everything* in a schema should seldom be an appropriate thing to do.
>
> While the feature itself might be fairly innocuous, I'm just wondering
> why we need to encourage manual vacuuming. And why that, but not
> say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
>
>

Sadly, manual vacuuming is very far from a thing of the past. Autovacuum
simply doesn't give us enough control in many cases.

Maybe this gadget would be useful, but its application seems a bit
limited. Someone mentioned allowing multiple --table options to
vacuumdb. That would be mopre flexible.

But really I think we need to work on how we can make autovacuum more
useful. For example, it would be nice not to have to do "ALTER TABLE" to
change the autovac settings. It would be nice to be able to specify
times of day and days of week when autovacuum should be turned on or off
for a table. I'm sure there are plenty of other ideas.

cheers

andrew


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <fabriziomello(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 22:00:38
Message-ID: 54989486.2060802@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/22/14, 10:05 AM, Andres Freund wrote:
>> While the feature itself might be fairly innocuous, I'm just wondering
>> >why we need to encourage manual vacuuming. And why that, but not
>> >say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
> There's one argument for supporting more for VACUUM than the rest - it
> can't be executed directly as the result of a query as the others
> can... I wonder if that'd not better be answered by adding a feature to
> vacuumdb that allows selecting the to-be-vacuumed table by a user
> defined query.

I would MUCH rather that we find a way to special-case executing non-transactional commands dynamically, because VACUUM isn't the only one that suffers from this problem.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: <fabriziomello(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-22 22:02:57
Message-ID: 54989511.9030806@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/21/14, 8:55 PM, Fabrízio de Royes Mello wrote:
> > And why that, but not
> > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
> >
>
> +1. I can write patches for each of this maintenance statement too.
>
>
> If we're going to go that route, then perhaps it would make more sense to create a command that allows you to apply a second command to every object in a schema. We would have to be careful about PreventTransactionChain commands.
>
>
> Sorry but I don't understand what you meant. Can you explain more about your idea?

There's a very large number of commands that could be useful to execute on every object in a schema. (RE)INDEX, CLUSTER, ALTER come to mind besides VACUUM.

Right now a lot of people just work around this with things like DO blocks, but as mentioned elsewhere in the thread that fails for commands that can't be in a transaction.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-23 13:44:17
Message-ID: CA+Tgmoa8r+CK_uKHVjC9ZGwdowZwAd19oX0b+SctqM-U9NywNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 22, 2014 at 5:00 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> I would MUCH rather that we find a way to special-case executing
> non-transactional commands dynamically, because VACUUM isn't the only one
> that suffers from this problem.

Is pg_background a solution to this problem?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: José Luis Tallón <jltallon(at)adv-solutions(dot)net>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-23 13:54:22
Message-ID: CA+TgmoYCG00YHq0zAnkCTD6S6Znt+LSC2mon3HGsG7CsHNRJpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 22, 2014 at 11:51 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Multi-table CLUSTER uses multiple transactions, so this should not be an
> issue. That said, I don't think there's much point in CLUSTER SCHEMA,
> much less TRUNCATE SCHEMA. Do you normally organize your schemas so
> that there are some that contain only tables that need to be truncated
> together? That would be a strange use case.
>
> Overall, this whole line of development seems like bloating the parse
> tables for little gain.

We added REINDEX SCHEMA less than three weeks ago; if we accept that
that was a good change, but think this is a bad one, it's not clear to
me that there is any guiding principle here beyond who happened to
weigh in on which threads.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-23 14:54:01
Message-ID: CAFcNs+r35Hifiv0egdSGz3PWP-yh-XO6ULrGiNgdEBnF04vEjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 22, 2014 at 8:02 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>
> On 12/21/14, 8:55 PM, Fabrízio de Royes Mello wrote:
>>
>> > And why that, but not
>> > say schema-wide ANALYZE, CLUSTER, TRUNCATE, ...
>> >
>>
>> +1. I can write patches for each of this maintenance statement
too.
>>
>>
>> If we're going to go that route, then perhaps it would make more
sense to create a command that allows you to apply a second command to
every object in a schema. We would have to be careful about
PreventTransactionChain commands.
>>
>>
>> Sorry but I don't understand what you meant. Can you explain more
about your idea?
>
>
> There's a very large number of commands that could be useful to execute
on every object in a schema. (RE)INDEX, CLUSTER, ALTER come to mind besides
VACUUM.
>

ANALYZE too...

> Right now a lot of people just work around this with things like DO
blocks, but as mentioned elsewhere in the thread that fails for commands
that can't be in a transaction.
>

I use "dblink" to solve it. :-)

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: José Luis Tallón <jltallon(at)adv-solutions(dot)net>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-23 15:28:35
Message-ID: 20141223152835.GN1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote:
> On Mon, Dec 22, 2014 at 11:51 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Multi-table CLUSTER uses multiple transactions, so this should not be an
> > issue. That said, I don't think there's much point in CLUSTER SCHEMA,
> > much less TRUNCATE SCHEMA. Do you normally organize your schemas so
> > that there are some that contain only tables that need to be truncated
> > together? That would be a strange use case.
> >
> > Overall, this whole line of development seems like bloating the parse
> > tables for little gain.
>
> We added REINDEX SCHEMA less than three weeks ago; if we accept that
> that was a good change, but think this is a bad one, it's not clear to
> me that there is any guiding principle here beyond who happened to
> weigh in on which threads.

I didn't think much of REINDEX SCHEMA, TBH.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: <fabriziomello(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-24 01:06:25
Message-ID: 549A1191.1060901@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/23/14, 8:54 AM, Fabrízio de Royes Mello wrote:
> > Right now a lot of people just work around this with things like DO blocks, but as mentioned elsewhere in the thread that fails for commands that can't be in a transaction.
> >
>
> I use "dblink" to solve it. :-)

So... how about instead of solving this only for vacuum we create something generic? :) Possibly using Robert's background worker work?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-24 01:09:27
Message-ID: 549A1247.2030802@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/23/14, 7:44 AM, Robert Haas wrote:
> On Mon, Dec 22, 2014 at 5:00 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> I would MUCH rather that we find a way to special-case executing
>> non-transactional commands dynamically, because VACUUM isn't the only one
>> that suffers from this problem.
>
> Is pg_background a solution to this problem?

Yes, since it allows you to do "autonomous transactions". It's probably not the most efficient way to solve this, but it should work.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-24 02:49:40
Message-ID: CAFcNs+pE=Sdi_s8E0Y449U_nerfRXmOyBMSg4UtW7o+bY1HyTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Em terça-feira, 23 de dezembro de 2014, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
escreveu:

> On 12/23/14, 8:54 AM, Fabrízio de Royes Mello wrote:
>
>> > Right now a lot of people just work around this with things like DO
>> blocks, but as mentioned elsewhere in the thread that fails for commands
>> that can't be in a transaction.
>> >
>>
>> I use "dblink" to solve it. :-)
>>
>
> So... how about instead of solving this only for vacuum we create
> something generic? :) Possibly using Robert's background worker work?

Interesting idea.

But and what about the idea of improve the "--table" option from clients:
vaccumdb and clusterdb?

Regards,

Fabrízio Mello

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: <fabriziomello(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-24 18:53:06
Message-ID: 549B0B92.4@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/23/14, 8:49 PM, Fabrízio de Royes Mello wrote:
> Em terça-feira, 23 de dezembro de 2014, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>> escreveu:
>
> On 12/23/14, 8:54 AM, Fabrízio de Royes Mello wrote:
>
> > Right now a lot of people just work around this with things like DO blocks, but as mentioned elsewhere in the thread that fails for commands that can't be in a transaction.
> >
>
> I use "dblink" to solve it. :-)
>
>
> So... how about instead of solving this only for vacuum we create something generic? :) Possibly using Robert's background worker work?
>
>
> Interesting idea.
>
> But and what about the idea of improve the "--table" option from clients: vaccumdb and clusterdb?

Seems reasonable.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Oskari Saarenmaa <os(at)ohmu(dot)fi>
To: fabriziomello(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal "VACUUM SCHEMA"
Date: 2014-12-28 15:38:45
Message-ID: 54A02405.1010402@ohmu.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

21.12.2014, 18:48, Fabrízio de Royes Mello kirjoitti:
> I work with some customer that have databases with a lot of schemas and
> sometimes we need to run manual VACUUM in one schema, and would be nice
> to have a new option to run vacuum in relations from a specific schema.
>
> The new syntax could be:
>
> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] { [ table_name ] | SCHEMA
> schema_name }
>
> Also I'll add a new option to "vacuumdb" client:
>
> -S, --schema=SCHEMA
>
> I can work on this feature to 2015/02 CF.
>
> Thoughts?

This would be useful for ANALYZE to make it easier to run analyze only
for the interesting schemas after a pg_upgrade. I have a database with
most of the actively used data in the "public" schema and a number of
rarely accessed large logging and archive tables in other schemas. It'd
be useful to prioritize analyzing the main tables before doing anything
about the rarely used schemas to allow the database to be put back into
production as soon as possible.

/ Oskari