Re: DROP TABLE and autovacuum

Lists: pgsql-hackerspgsql-patches
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: DROP TABLE and autovacuum
Date: 2007-06-13 06:28:51
Message-ID: 20070613141752.22EC.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

If we tries to drop the table on which autovacuum is running, we have to
wait finish of the vacuum. However, the vacuuming effort goes to waste for
the table being dropped or rewritten. Meanwhile, we've already had the
autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
Can we extend the feature to several TABLE commands?

One simple solution is that every time a non-autovacuum backend tries to
access a table with a lock equal or stronger than SHARE UPDATE EXCLUSIVE,
the backend checks whether some autovacuum workers are vacuuming the table
and send SIGINT to them.

Is this worth doing? Or are there any dangerous situation in it?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP TABLE and autovacuum
Date: 2007-06-13 13:48:12
Message-ID: 20070613134812.GI4640@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro wrote:
> If we tries to drop the table on which autovacuum is running, we have to
> wait finish of the vacuum. However, the vacuuming effort goes to waste for
> the table being dropped or rewritten. Meanwhile, we've already had the
> autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
> Can we extend the feature to several TABLE commands?
>
> One simple solution is that every time a non-autovacuum backend tries to
> access a table with a lock equal or stronger than SHARE UPDATE EXCLUSIVE,
> the backend checks whether some autovacuum workers are vacuuming the table
> and send SIGINT to them.
>
> Is this worth doing? Or are there any dangerous situation in it?

Well, one problem with this is that currently SIGINT cancels the whole
autovacuum worker, not just the table currently being processed. I
think this can be fixed easily by improving the signal handling.

Aside from that, I don't see any problem in handling DROP TABLE like you
suggest. But I don't feel comfortable with doing it with just any
strong locker, because that would easily starve tables from being
vacuumed at all.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP TABLE and autovacuum
Date: 2007-06-13 13:51:15
Message-ID: 9302.1181742675@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> If we tries to drop the table on which autovacuum is running, we have to
> wait finish of the vacuum. However, the vacuuming effort goes to waste for
> the table being dropped or rewritten. Meanwhile, we've already had the
> autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
> Can we extend the feature to several TABLE commands?

> One simple solution is that every time a non-autovacuum backend tries to
> access a table with a lock equal or stronger than SHARE UPDATE EXCLUSIVE,
> the backend checks whether some autovacuum workers are vacuuming the table
> and send SIGINT to them.

I don't think this is a good idea at all. You're proposing putting a
dangerous sledgehammer into a core part of the system in order to fix a
fairly minor annoyance.

For the specific case of DROP TABLE, a SIGINT might be a good idea
but I don't agree with it for any weaker action.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: DROP TABLE and autovacuum
Date: 2007-06-14 09:07:07
Message-ID: 20070614174225.6A6B.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:

> ITAGAKI Takahiro wrote:
> > autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
> > Can we extend the feature to several TABLE commands?
>
> Well, one problem with this is that currently SIGINT cancels the whole
> autovacuum worker, not just the table currently being processed. I
> think this can be fixed easily by improving the signal handling.

There is no difference between SIGINT and SIGTERM against autovacuum
workers presently. I'm thinking to split their effects -- SIGINT to
'skip the current table' and SIGTERM to 'cancel all tables'.

BTW, if autovacuum workers are signaled by an internal server activity,
we will see 'ERROR: canceling statement due to user request' in server log.
Is it surprising to users? I prefer quiet shutdown to ERROR logs.

> Aside from that, I don't see any problem in handling DROP TABLE like you
> suggest. But I don't feel comfortable with doing it with just any
> strong locker, because that would easily starve tables from being
> vacuumed at all.

Hmm, how about canceling only the cases of DROP TABLE, TRUNCATE and CLUSTER.
We will obviously not need the table after the commands. Other commands,
VACUUM (FULL), ANALYZE, CREATE INDEX (CONCURRENTLY), REINDEX and LOCK TABLE
still conflict with autovacuum, but I'll leave it as-is in the meantime.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP TABLE and autovacuum
Date: 2007-06-14 13:29:55
Message-ID: 20070614132955.GB5105@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
>
> > ITAGAKI Takahiro wrote:
> > > autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
> > > Can we extend the feature to several TABLE commands?
> >
> > Well, one problem with this is that currently SIGINT cancels the whole
> > autovacuum worker, not just the table currently being processed. I
> > think this can be fixed easily by improving the signal handling.
>
> There is no difference between SIGINT and SIGTERM against autovacuum
> workers presently. I'm thinking to split their effects -- SIGINT to
> 'skip the current table' and SIGTERM to 'cancel all tables'.

Sure, we can do that (it's even mentioned in the comments in autovacuum.c).

> BTW, if autovacuum workers are signaled by an internal server activity,
> we will see 'ERROR: canceling statement due to user request' in server log.
> Is it surprising to users? I prefer quiet shutdown to ERROR logs.

Maybe cancelling the current table processing should be just a WARNING,
not ERROR. We would abort the transaction quietly.

> > Aside from that, I don't see any problem in handling DROP TABLE like you
> > suggest. But I don't feel comfortable with doing it with just any
> > strong locker, because that would easily starve tables from being
> > vacuumed at all.
>
> Hmm, how about canceling only the cases of DROP TABLE, TRUNCATE and CLUSTER.
> We will obviously not need the table after the commands. Other commands,
> VACUUM (FULL), ANALYZE, CREATE INDEX (CONCURRENTLY), REINDEX and LOCK TABLE
> still conflict with autovacuum, but I'll leave it as-is in the meantime.

Well, all of DROP TABLE, TRUNCATE and CLUSTER seem safe -- and also,
they will advance the table's relfrozenxid. No objection there.

I think all the others you mention should be waiting on autovacuum, not
cancel it. Maybe what we could do with VACUUM and ANALYZE is let the
user know that the table is being processed by autovacuum and return
quickly.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP TABLE and autovacuum
Date: 2007-06-14 14:12:09
Message-ID: 20070614141209.GG5105@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Alvaro Herrera wrote:
> ITAGAKI Takahiro wrote:

> > Hmm, how about canceling only the cases of DROP TABLE, TRUNCATE and CLUSTER.
> > We will obviously not need the table after the commands. Other commands,
> > VACUUM (FULL), ANALYZE, CREATE INDEX (CONCURRENTLY), REINDEX and LOCK TABLE
> > still conflict with autovacuum, but I'll leave it as-is in the meantime.
>
> Well, all of DROP TABLE, TRUNCATE and CLUSTER seem safe -- and also,
> they will advance the table's relfrozenxid. No objection there.

Something worth considering, though unrelated to the topic at hand: what
happens with the table stats after CLUSTER? Should we cause an ANALYZE
afterwards? We could end up running with outdated statistics.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP TABLE and autovacuum
Date: 2007-06-21 03:27:22
Message-ID: 20070621115823.6E6B.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:

> Something worth considering, though unrelated to the topic at hand: what
> happens with the table stats after CLUSTER? Should we cause an ANALYZE
> afterwards? We could end up running with outdated statistics.

We don't invalidate the value statistics in pg_stats by ANALYZE presently.

Also, the runtime statistics are not invalidated -- it cound be a bug.
pgstat_drop_relation() is expecting relid (pg_class.oid) as the argument,
but we pass it relfilenode.

[storage/smgr/smgr.c]
static void
smgr_internal_unlink(RelFileNode rnode, int which, bool isTemp, bool isRedo)
{
...
/*
* Tell the stats collector to forget it immediately, too. Skip this in
* recovery mode, since the stats collector likely isn't running (and if
* it is, pgstat.c will get confused because we aren't a real backend
* process).
*/
if (!InRecovery)
pgstat_drop_relation(rnode.relNode);

...
}

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pgstat_drop_relation bugfix
Date: 2007-06-22 05:31:46
Message-ID: 20070622140217.612A.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I wrote:
> the runtime statistics are not invalidated -- it cound be a bug.
> pgstat_drop_relation() is expecting relid (pg_class.oid) as the argument,
> but we pass it relfilenode.
>
> [storage/smgr/smgr.c]
> smgr_internal_unlink(RelFileNode rnode, int which, bool isTemp, bool isRedo)
> pgstat_drop_relation(rnode.relNode);

I'm trying to fix the bug, because there is a possibility that some useless
statistics data continue to occupy some parts of the statistics table.

The bugfix itself is not so difficult; we only need to add a relid field
to PendingRelDelete structure and pass it to pgstat_drop_relation().
However, there are some design issues here.

- smgr need to know relation oid not only relfilenode.
This might brake the module independency.

- What should we do on TRUNCATE, CLUSTER and rewriting table?
The runtime statistics are still valid after those commands
practically, but we discard them in the current logic.
TRUNCATE should be set both n_live_tup and n_dead_tup to zero.
CLUSTER and rewriting taable should be set only n_dead_tup to zero.
But it might be good to keep other statistics (# of scans).

Are there any other more clever ways?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org
Subject: pgstat_drop_relation bugfix
Date: 2007-06-26 11:19:22
Message-ID: 20070626195741.6A3D.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I wrote:
> > pgstat_drop_relation() is expecting relid (pg_class.oid) as the argument,
> > but we pass it relfilenode.
> I'm trying to fix the bug, because there is a possibility that some useless
> statistics data continue to occupy some parts of the statistics table.

Here is a patch to fix undropped entries in the runtime statistics table.
Now smgr records the relation oids and uses them to drop entries
corresponding the relations.

We could make stray entries easily using the following queries.
CREATE TABLE test (i integer);
INSERT INTO test VALUES(1);
TRUNCATE test;
DROP TABLE test;

Since we don't discard any of stat entries except pg_stat_reset(),
those useless entries would cause memory leaks, though it is very trivial.

I fell my fix is uncool; Let me know if there are any other better ways.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
pgstat_drop_relation.patch application/octet-stream 12.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: pgstat_drop_relation bugfix
Date: 2007-07-03 16:43:58
Message-ID: 9564.1183481038@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> I wrote:
>>> pgstat_drop_relation() is expecting relid (pg_class.oid) as the argument,
>>> but we pass it relfilenode.

> Here is a patch to fix undropped entries in the runtime statistics table.
> Now smgr records the relation oids and uses them to drop entries
> corresponding the relations.
> ...
> I fell my fix is uncool; Let me know if there are any other better ways.

I don't like this patch either. It's ugly because it introduces
relation OIDs into a level of the system that should only be dealing in
relfilenodes --- which I think is not just cosmetic, it adds hazards of
using the wrong one (which indeed is exactly the type of bug we're
trying to fix...) It's not even correct: the patch for
setNewRelfilenode specifies the rel OID to both the smgrcreate and
smgrscheduleunlink calls, which means that the stats collector will be
told to drop its stats for that OID on either commit or rollback.
Surely not what we'd like. Now that could be fixed, but it illustrates
that this is actually a very fragile way of thinking about and dealing
with the problem.

I thought for awhile about changing the pgstats stuff to identify
relations by RelFileNode instead of OID. That has a number of
attractions --- you could argue that it's more correct when dealing with
reassigned relfilenodes, for instance. But it'd be a lot of work and
it would mean changing the signatures of all the pg_stat_get_foo()
functions, which would break people's custom stats views.

What I'm inclined to propose is that we just remove the
pgstat_drop_relation() call from smgr_internal_unlink, and rely entirely
on VACUUM to clean out dead entries in the pgstats data.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: pgstat_drop_relation bugfix
Date: 2007-07-08 22:27:49
Message-ID: 17624.1183933669@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I wrote:
> What I'm inclined to propose is that we just remove the
> pgstat_drop_relation() call from smgr_internal_unlink, and rely entirely
> on VACUUM to clean out dead entries in the pgstats data.

On checking the CVS history, that in fact is how the code worked before
8.1.3, when I introduced the bogus call in a fit of over-optimization :-(.
I vaguely recall thinking that the chance of losing data would be small
and pgstat is subject to losing data anyway. However, we are definitely
moving (slowly) in the direction of making pgstat more trustworthy, so
it's probably best not to take the risk of dropping useful stats.

Call removed from 8.1.x and up.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: pgstat_drop_relation bugfix
Date: 2007-07-13 08:08:24
Message-ID: 20070713170023.8E14.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> > What I'm inclined to propose is that we just remove the
> > pgstat_drop_relation() call from smgr_internal_unlink
> On checking the CVS history, that in fact is how the code worked before
> 8.1.3, when I introduced the bogus call in a fit of over-optimization :-(.

I'm worried that we leave garbage entries in the stats. As you suggested,
don't we need to remove unreferenced entries from stats periodically?

> > and rely entirely
> > on VACUUM to clean out dead entries in the pgstats data.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: pgstat_drop_relation bugfix
Date: 2007-07-13 13:46:32
Message-ID: 20835.1184334392@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> I'm worried that we leave garbage entries in the stats. As you suggested,
> don't we need to remove unreferenced entries from stats periodically?

VACUUM does that already.

regards, tom lane