Re: Size for vacuum_mem

From: "Francisco Reyes" <lists(at)natserv(dot)com>
To: "neilc(at)samurai(dot)com" <neilc(at)samurai(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Size for vacuum_mem
Date: 2002-12-05 17:57:16
Message-ID: 200212052035.gB5KZpo17961@mx2.drf.com
Lists: pgsql-general

On 5 Dec 2002, Neil Conway wrote:

> For these, it would probably be faster to TRUNCATE the table and then
> load the new data, then ANALYZE.

I can't, because those tables would not be usable during the load.
Right now I do the delete/copy from within a transaction. If the loads
are still running when people start arriving in the morning, they can
still do their work.
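
(For illustration, a minimal sketch of that kind of transactional
reload; the table name and file path are made up. Because of MVCC,
readers keep seeing the old rows until the COMMIT, which is why the
table stays usable during the load.)

    BEGIN;
    -- Hypothetical table and data file. Readers continue to see the
    -- old rows until COMMIT, so the table stays usable while loading.
    DELETE FROM daily_facts;
    COPY daily_facts FROM '/data/loads/daily_facts.txt';
    COMMIT;
    -- Refresh planner statistics once the new data is in place.
    ANALYZE daily_facts;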

> > while other tables I delete/reload about 1/3 (ie
> > 7 Million records table I delete/copy 1.5 Million records).
>
> For these, you can try just using a plain VACUUM and seeing how
> effective that is at reclaiming space.

I am not too concerned with space reclamation. In theory if I don't do
vacuum fulls I may have some dead space, but it would get re-used daily.
My concern is the performance hit I would suffer with the table scans.

>If necessary, increase max_fsm_pages.

What is this setting for? To what number could I increase it?
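
(For reference: max_fsm_pages sizes the free space map, that is,
roughly how many pages with reusable space a plain VACUUM can remember
so that later inserts and updates can fill them. A sketch of the
related postgresql.conf knobs; the values are made up, not
recommendations.)

    -- postgresql.conf sketch (hypothetical values; max_fsm_pages needs a restart):
    --   max_fsm_pages     = 100000   -- pages with free space the map can track
    --   max_fsm_relations = 1000     -- relations tracked in the free space map
    --   vacuum_mem        = 32768    -- KB of memory a single VACUUM may use
    -- The current value can be checked from psql:
    SHOW max_fsm_pages;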

> You might also want to check and see if your indexes are growing in size
> (btree indexes on incrementally increasing values like timestamps can
> grow, even with VACUUM FULL); use REINDEX if that's the case.

Every once in a while I truncate the tables and re-load the whole set.
Probably about every couple of months.
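
(As a footnote to the REINDEX suggestion above, a minimal sketch with
hypothetical index and table names; keep in mind that REINDEX locks
the table while it rebuilds.)

    -- Rebuild one bloated index:
    REINDEX INDEX events_created_at_idx;
    -- Or rebuild every index on the table:
    REINDEX TABLE events;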


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: Francisco Reyes <lists(at)natserv(dot)com>
Cc: "neilc(at)samurai(dot)com" <neilc(at)samurai(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Size for vacuum_mem
Date: 2002-12-05 21:57:56
Message-ID: 1039125476.11130.179.camel@camel
Lists: pgsql-general

On Thu, 2002-12-05 at 12:57, Francisco Reyes wrote:
> > For these, you can try just using a plain VACUUM and seeing how
> > effective that is at reclaiming space.
>
> I am not too concerned with space reclamation. In theory if I don't do
> vacuum fulls I may have some dead space, but it would get re-used daily.
> My concern is the performance hit I would suffer with the table scans.
>

You should see very little performance impact from lazy vacuuming. If
there is a performance hit, you can offset some of it with quicker
queries (if you do VACUUM ANALYZE). And remember, lazy vacuums are
non-blocking, so users won't see an impact from that standpoint. The
trick is to find a good interval that keeps your tables from growing
too big. I have one table that I vacuum every 10 minutes (its whole
content gets updated within 15 minutes), which keeps the size very
manageable (it's not a huge table, or I would do it even more often).
In this scenario, you can still do VACUUM FULLs if you feel the need,
but they should take much less time.
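
(A sketch of what such a schedule might look like; the database and
table names are hypothetical, and the interval has to be tuned to the
actual update rate.)

    -- Run periodically, e.g. from cron:
    --   */10 * * * *  psql -d proddb -c "VACUUM ANALYZE session_summary;"
    -- The SQL itself is just:
    VACUUM ANALYZE session_summary;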

Robert Treat


From: "David Blood" <david(at)matraex(dot)com>
To: "'Robert Treat'" <xzilla(at)users(dot)sourceforge(dot)net>, "'Francisco Reyes'" <lists(at)natserv(dot)com>
Cc: <neilc(at)samurai(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Size for vacuum_mem
Date: 2002-12-05 23:11:35
Message-ID: 005801c29cb3$ac060570$1f00a8c0@redwood
Lists: pgsql-general

A "lazy vacuum" can hurt If you have lots of i/o. If we try to run it
during the day it kills us. This is because to vacuum all the tables
postgres has to read them from the disk. While it doesn't not lock rows
it does block other rows from reading/writing to/from the disk.



From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: David Blood <david(at)matraex(dot)com>
Cc: "'Robert Treat'" <xzilla(at)users(dot)sourceforge(dot)net>, "'Francisco Reyes'" <lists(at)natserv(dot)com>, <neilc(at)samurai(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Size for vacuum_mem
Date: 2002-12-05 23:59:37
Message-ID: Pine.LNX.4.33.0212051657010.18114-100000@css120.ihs.com
Lists: pgsql-general

On Thu, 5 Dec 2002, David Blood wrote:

> A "lazy vacuum" can hurt If you have lots of i/o. If we try to run it
> during the day it kills us. This is because to vacuum all the tables
> postgres has to read them from the disk. While it doesn't not lock rows
> it does block other rows from reading/writing to/from the disk.

How much shared memory do you have allocated to PostgreSQL?

I've found that with a couple hundred megs of shared buffers on a
machine with a gig or more of RAM, lazy vacuums (in 7.2.x and later;
7.1 has massive problems with lazy vacuums acting up) don't seem to
affect performance much at all.

Vacuum on most of my boxen results in no more than a 5% performance
loss for other queries (all types: select, update, delete, insert) but
keeps the database running well, even if they are running one vacuum
right after another.
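
(For anyone following along: shared_buffers is counted in 8 KB pages
in this era, so "a couple hundred megs" translates into postgresql.conf
roughly as below. The numbers are illustrative, not recommendations.)

    -- postgresql.conf sketch (illustrative values):
    --   shared_buffers = 25600   -- 25600 x 8 KB pages = 200 MB of shared buffers
    -- Verify after a restart from psql:
    SHOW shared_buffers;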


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David Blood" <david(at)matraex(dot)com>
Cc: "'Robert Treat'" <xzilla(at)users(dot)sourceforge(dot)net>, "'Francisco Reyes'" <lists(at)natserv(dot)com>, neilc(at)samurai(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: Size for vacuum_mem
Date: 2002-12-06 19:56:47
Message-ID: 12426.1039204607@sss.pgh.pa.us
Lists: pgsql-general

"David Blood" <david(at)matraex(dot)com> writes:
> A "lazy vacuum" can hurt If you have lots of i/o. If we try to run it
> during the day it kills us. This is because to vacuum all the tables
> postgres has to read them from the disk. While it doesn't not lock rows
> it does block other rows from reading/writing to/from the disk.

On the other hand, I have watched people lazy-vacuum production
databases in 7.2.* and not seen any visible hit on system load
(as far as top or vmstat could show, anyway).

I think it may be a matter of whether you have disk bandwidth to
spare. If the disk farm is marginal, the extra demand from a vacuum
may push you over the knee of the performance curve. But that's just
a guess. It would be interesting if some folks from the "it doesn't
hurt" and the "it does hurt" camps could compare notes and try to
understand the reason for the difference in their results.

regards, tom lane


From: "Nicolai Tufar" <ntufar(at)apb(dot)com(dot)tr>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Size for vacuum_mem
Date: 2002-12-06 20:22:12
Message-ID: 002c01c29d65$2933e600$8016a8c0@apb.com.tr
Lists: pgsql-general

> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> On the other hand, I have watched people lazy-vacuum production
> databases in 7.2.* and not seen any visible hit on system load
> (as far as top or vmstat could show, anyway).

In my experience, lazy vacuum never strains the CPU on modern
processors. It is definitely I/O-bound.

regards,
Nic


From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Size for vacuum_mem
Date: 2002-12-07 07:52:17
Message-ID: 20021206235217.A7702@blighty.com
Lists: pgsql-general

On Fri, Dec 06, 2002 at 02:56:47PM -0500, Tom Lane wrote:
> "David Blood" <david(at)matraex(dot)com> writes:
> > A "lazy vacuum" can hurt If you have lots of i/o. If we try to run it
> > during the day it kills us. This is because to vacuum all the tables
> > postgres has to read them from the disk. While it doesn't not lock rows
> > it does block other rows from reading/writing to/from the disk.
>
> On the other hand, I have watched people lazy-vacuum production
> databases in 7.2.* and not seen any visible hit on system load
> (as far as top or vmstat could show, anyway).
>
> I think it may be a matter of whether you have disk bandwidth to
> spare. If the disk farm is marginal, the extra demand from a vacuum
> may push you over the knee of the performance curve. But that's just
> a guess. It would be interesting if some folks from the "it doesn't
> hurt" and the "it does hurt" camps could compare notes and try to
> understand the reason for the difference in their results.

I'm firmly in the "devastating to performance" camp.

7.2.3, reasonably well-tuned on a not overspecced, but adequate
Solaris box. (Built with a non-Solaris qsort, though I doubt that's
relevant).

Several large-ish (hundreds of thousands to millions of rows), fairly
heavily updated tables, with some text fields large enough to push
data out to toast.

Vacuumed pretty much continuously while data was being updated, so it
didn't get too far out to lunch.

Then the processes updating the tables were shut down, so the system
was basically idle, and the tables were vacuumed. Simple selects (from
some small tables, via psql) slowed to a crawl - tens of seconds to
get any response. There was a lot of I/O but also high CPU usage -
including a fair fraction of system time.

It felt like the system was I/O-starved, yet it would run a very
intensive DB app quite happily if it wasn't vacuuming.

(I finally rewrote the algorithm to avoid UPDATE, instead storing
deltas in a daily table, then every night reading all the deltas and
all the archived data and inserting the merged data into a new archive
table (then indexing it and renaming to replace the old archived data
table). Ugly, and offended my SQL sensibilities, but it avoided having
to keep that table vacuumed.)
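
(A rough sketch of that nightly rebuild, with hypothetical table names
and a deliberately trivial merge rule; the real merge logic is of
course application-specific.)

    BEGIN;
    -- Fold the day's deltas into a brand-new copy of the archive.
    CREATE TABLE archive_new AS
        SELECT a.id, a.total + COALESCE(d.delta, 0) AS total
          FROM archive a
          LEFT JOIN deltas_today d ON d.id = a.id;
    CREATE INDEX archive_new_id_idx ON archive_new (id);
    -- Swap the new table in under the old name.
    ALTER TABLE archive RENAME TO archive_old;
    ALTER TABLE archive_new RENAME TO archive;
    COMMIT;
    DROP TABLE archive_old;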

Cheers,
Steve


From: "Peter Darley" <pdarley(at)kinesis-cem(dot)com>
To: "David Blood" <david(at)matraex(dot)com>, "'Robert Treat'" <xzilla(at)users(dot)sourceforge(dot)net>, "'Francisco Reyes'" <lists(at)natserv(dot)com>
Cc: <neilc(at)samurai(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Slow Lazy Vacuum (was Size for vacuum_mem)
Date: 2002-12-09 23:02:49
Message-ID: NNEAICKPNOGDBHNCEDCPOEADDDAA.pdarley@kinesis-cem.com
Lists: pgsql-general

Friends,
	I just want to throw my support behind what David said. Because our
fast drive failed and we haven't been able to replace it (we're down to
a single slow IDE drive), we have particularly slow disks but a lot of
memory (our data all fits in the cache). The system flies except while
doing a vacuum. Just offering an example of when a lazy vacuum can
significantly slow things down.
Thanks,
Peter Darley
