Re: [DEFAULT] Daily digest v1.4346 (20 messages)

From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: josh(at)agliodbs(dot)com, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [DEFAULT] Daily digest v1.4346 (20 messages)
Date: 2004-03-23 02:45:57
Message-ID: 405FA4E5.3080305@zeut.net
Lists: pgsql-hackers

Josh Berkus wrote:

>>Inability to customize thresholds on a per table basis
>>
>>
>
>This hasn't been a big problem for me. I would judge that 80% of my clients
>would make no use of this feature.
>
>
>
Ok.

>>Inability to set default thresholds on a per database basis
>>
>>
>
>This would be much more useful to us.
>
>
>
Interesting. Most users request the per-table settings, so I guess there
is sufficient demand for both.

>>Inability to schedule vacuums during off-peak times
>>
>>
>
>I don't think that this is the job of pg_autovacuum. If a database requires
>bulk loads and other burst activity, the DBA should schedule manual vacuums
>around those and not use pg_autovacuum.
>
>
>
You might be missing the point: the advantage of using pg_autovacuum is
that it wouldn't waste cycles doing vacuums on tables that don't need
it. If we have persistent data (saving state information periodically)
then this is a very easy feature to add.
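
A minimal sketch of how off-peak scheduling could sit on top of persisted
activity counts: vacuum only inside a configured window, and only the tables
whose activity has crossed a threshold. The function names, the window times,
and the per-table threshold dictionary are all hypothetical; none of this is
existing pg_autovacuum behaviour.

```python
from datetime import time


def in_off_peak(now, start, end):
    """Return True if `now` falls inside [start, end).

    All three arguments are datetime.time values. A window such as
    22:00-04:00 that crosses midnight is handled by the second branch.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def tables_to_vacuum(activity, thresholds, now, start, end,
                     default_threshold=1000):
    """Pick tables that need a vacuum, but only during the off-peak window.

    `activity` maps table name -> rows changed since the last vacuum
    (assumed to come from the persisted state). `thresholds` holds
    optional per-table overrides of `default_threshold`.
    """
    if not in_off_peak(now, start, end):
        return []
    return sorted(
        table
        for table, changed in activity.items()
        if changed >= thresholds.get(table, default_threshold)
    )
```

Outside the window nothing is vacuumed, and inside it the tables that have
seen little activity are still skipped, which is the point being made above.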

>>Lack of integration related to startup and shutdown
>>
>>
>
>Yes, this is a pain, especially from a security standpoint.
>
>
>
Yes, backend integration will make this go away.

>>Ignorance of VACUUM and ANALYZE operations performed outside pg_autovacuum
>>(requires backend integration? or can listen / notify be used?)
>
>Again, I think this is not crucial, personally. Nice if there's some easy
>way to do it, of course.
>
>
>
What I'm thinking is that the VACUUM command could be modified to write
down some data from the stats system at vacuum time. Once the VACUUM
command writes this down for itself then pg_autovacuum just uses that
number to make its decision. Again, we are trying to reduce superfluous
vacuums as much as possible. If an admin vacuums his whole cluster
every Sunday night, that may prevent lots of vacuums from occurring during
business hours that affect processing.
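
A rough sketch of that idea, under stated assumptions: every VACUUM, manual or
automatic, records the table's cumulative update/delete counter as a baseline,
and the daemon compares the current counter against that baseline. The
base-plus-scaling-factor threshold form and the numbers used here are
illustrative assumptions, and the names are hypothetical.

```python
def vacuum_threshold(reltuples, base=1000, scale=2.0):
    """Illustrative threshold: a fixed base plus a multiple of table size."""
    return base + scale * reltuples


def needs_vacuum(current_count, baseline_count, reltuples,
                 base=1000, scale=2.0):
    """Decide whether a table needs vacuuming.

    `current_count` is the cumulative update+delete counter from the stats
    system; `baseline_count` is the value that was written down the last
    time VACUUM ran on the table, no matter who issued it.
    """
    changed = current_count - baseline_count
    return changed >= vacuum_threshold(reltuples, base, scale)
```

With a 1000-row table the threshold is 3000 changed rows. If the counter
stands at 5000 and the Sunday-night manual vacuum resets the baseline to 5000,
then a counter of 5200 on Monday morning does not trigger another vacuum.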

>>Lack of logging options / syslog integration / log rotation options
>>
>>
>
>Yep, this is a biggie.
>
>
>
Agreed. This is another issue that could be solved with backend
integration.

>Now, let me add my comments as to what my clients have complained about:
>
>-- Lack of integrated security with the Postmaster
>-- Inability to detect VACUUMs "backing up" due to too low vacuum mem or too
>much activity and warn the DBA
>-- Inability to Vacuum in parallel on high-capacity machines
>-- No "timeout" for locked vacuums.
>
>
>
Backend integration should solve the first issue. Parallel vacuums are
something that could be worked on at some point. Would it make sense
to incorporate this with tablespaces? The vacuum daemon would only
issue one vacuum command per tablespace, but could issue as many
parallel vacuums as you have independent tablespaces.
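
The tablespace idea could be sketched roughly as follows: group the tables
that need vacuuming by tablespace, run one worker per tablespace, and have
each worker vacuum its own tables one at a time. This is only an illustration
with hypothetical names; `vacuum_one` stands in for whatever actually issues
the VACUUM command.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_tablespace(tables):
    """Turn (table, tablespace) pairs into {tablespace: [tables...]}."""
    groups = defaultdict(list)
    for table, tablespace in tables:
        groups[tablespace].append(table)
    return dict(groups)


def vacuum_in_parallel(tables, vacuum_one):
    """Run at most one vacuum per tablespace at any moment.

    Tables within a tablespace are processed serially, in input order;
    separate tablespaces proceed concurrently. Returns the results of
    `vacuum_one` keyed by tablespace.
    """
    groups = group_by_tablespace(tables)
    if not groups:
        return {}

    def drain(queue):
        # One worker owns one tablespace and works through it in order.
        return [vacuum_one(table) for table in queue]

    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        futures = {ts: pool.submit(drain, queue) for ts, queue in groups.items()}
        return {ts: future.result() for ts, future in futures.items()}
```

The degree of parallelism is simply the number of distinct tablespaces among
the tables to be vacuumed, so a single-tablespace install behaves exactly as
it does today.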

I think the timeout issue would need to be a part of vacuum proper, and I'm
not sure about the "backing up" issue.

>>Since many people do not like tools that clutter their databases by adding
>>tables, I think option 1 (adding a pg_autovacuum table to existing databases)
>>is right out.
>
>Personally, I like the idea of a pg_autovacuum table, and would support it.
>However, I have no strong objections to the other approaches.
>
>
>
I think I was unclear. I agree that creating a pg_autovacuum system
table is fine (if we really need it), but my initial post was talking
about keeping pg_autovacuum as a client app, in which case the autovacuum
table would be added to (and clutter up) the user's table space, not the
system's.

>I think we've already had feedback about this. If it's system information, it
>should go in one of the existing tables, or it should be called something
>more descriptive than "table_data", and should begin with pg_
>
>
>
I wasn't really suggesting table_data as the real name, but again this
will be more straightforward once pg_autovacuum is integrated into the
backend.

>Some consideration should also be given to the frequency of updating the
>persistent data. I would favor an asynchronous, infrequent updating that
>would permit some loss of information over a synchronous lossless approach.
>The latter, while more accurate, would detract from server performance on
>high-volume transaction databases.
>
>
>
Agreed, the performance impact of this should be negligible.

>>3.Single-Pass Mode (External Scheduling):
>>
>>I have received requests to be able to run pg_autovacuum only on request
>>
>>
>I think this is a completely different utility from pg_autovacuum, and this
>line of development need not be pursued unless it's easy to do. I
>certainly don't need it ....
>
>
>
The reason it's similar is that once pg_autovacuum data is persistent,
it would be trivial to implement this feature, and the data that any
tool would need to make these decisions is the same as what
pg_autovacuum is already tracking.
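
To show why persistence makes single-pass mode nearly free, here is a small
sketch: each run loads the saved baselines, vacuums whatever has crossed the
threshold, writes the baselines back, and exits. The JSON file, the flat
threshold, and all of the names are assumptions made for illustration only.

```python
import json
from pathlib import Path


def single_pass(state_path, current_counts, threshold, vacuum_one):
    """Run one vacuum pass using baselines persisted between invocations.

    `current_counts` maps table -> cumulative change counter. A table is
    vacuumed when the counter has advanced by at least `threshold` since
    the baseline stored at `state_path`; tables with no stored baseline
    are treated as having a baseline of 0.
    """
    path = Path(state_path)
    baselines = json.loads(path.read_text()) if path.exists() else {}

    vacuumed = []
    for table, count in sorted(current_counts.items()):
        if count - baselines.get(table, 0) >= threshold:
            vacuum_one(table)
            baselines[table] = count  # new baseline after a vacuum
            vacuumed.append(table)

    path.write_text(json.dumps(baselines))
    return vacuumed
```

An external scheduler such as cron could call something like this as often as
it wants; the daemon's long-running loop would be the same decision logic
wrapped in a sleep.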

>>Syslog support. I'm not sure this is really needed, but a simple patch was
>>
>>
>I need it, and am glad to hear there is a patch. Several of my clients use
>centralized syslog servers, and do *everything* through syslog.
>
>
>
I think the patch was submitted to either the hackers or patches list.
If you can't find it, I'll look around and see if I still have a copy.
The person who submitted it said it was simple but was working for him in
production.
