Re: [PATCHES] VACUUM Improvements - WIP Patch

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: VACUUM Improvements - WIP Patch
Date: 2008-06-10 05:32:48
Message-ID: 2e78013d0806092232h6ca15ffejcbcd24e88401308f@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

Here is a WIP patch based on the discussions here:
http://archives.postgresql.org/pgsql-hackers/2008-05/msg00863.php

The attached WIP patch improves the LAZY VACUUM by limiting or
avoiding the second heap scan. This not only saves considerable time
in VACUUM, but also reduces the double-writes of vacuumed blocks. If
the second heap scan is considerably limited, that should also save
CPU usage and reduce WAL log writing.

With HOT, the first heap scan prunes and defrags every page in the
heap. That truncates all the dead tuples to their DEAD line pointers
and releases all the free space in the page. The second scan only
removes these DEAD line pointers and records the free space in the
FSM. The free space in fact does not change from the first pass. But
to do so, it again calls RepairPageFragmentation on each page, dirties
the page and calls log_heap_clean() again on the page. This clearly
looks like too much work for a small gain.

As this patch stands, the first phase of vacuum prunes the heap pages
as usual. But it marks the DEAD line pointers as DEAD_RECLAIMED to
signal that the index pointers to these line pointers are being
removed, if certain conditions are satisfied. When another backend
prunes a page, it also reclaims DEAD_RECLAIMED line pointers by marking
them UNUSED. We need some additional logic to do this in a safe way:

- An additional boolean pg_class attribute (relvac_inprogress) is used
to track the status of vacuum on a relation. If the attribute is true,
either vacuum is in progress on the relation or the last vacuum did
not complete successfully.

When VACUUM starts, it sets relvac_inprogress to true. The transaction
is committed and a new transaction is started so that all other
backends can see the change. We also note down the transactions which
may already have the table open. VACUUM then starts the first heap
scan. It prunes the page, but it can start marking the DEAD line
pointers as DEAD_RECLAIMED only after it knows that all other backends
can see that VACUUM is in progress on the target relation. Otherwise
there is a danger that backends might reclaim DEAD line pointers
before their index pointers are removed and that would lead to index
corruption. We do that by periodic conditional waits on the noted
transaction ids. Once all old transactions are gone, VACUUM sets the
second scan limit to the current block number and starts marking
subsequent DEAD line pointers as DEAD_RECLAIMED.
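
A rough sketch of the first-pass logic described above, not taken from the
attached patch. All type and function names are hypothetical, and two points
are assumptions read into the description: that blocks below the second scan
limit are the ones still needing the second heap scan, and that another
backend may reclaim DEAD_RECLAIMED pointers only when it sees
relvac_inprogress as false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical line pointer states; only DEAD_RECLAIMED is new in the proposal. */
typedef enum {
    LP_SK_UNUSED,
    LP_SK_NORMAL,
    LP_SK_DEAD,           /* tuple gone, index entries may still point here */
    LP_SK_DEAD_RECLAIMED  /* index entries are being removed by this VACUUM */
} LpState;

typedef struct {
    bool   old_xacts_gone;    /* have all noted transactions finished? */
    size_t second_scan_limit; /* blocks below this still need the second scan */
} VacuumSketch;

/* First-pass handling of one block that has already been pruned. */
void first_pass_block(VacuumSketch *vac, size_t blkno,
                      LpState *items, size_t nitems, bool xacts_gone_now)
{
    if (!vac->old_xacts_gone && xacts_gone_now) {
        /* Every backend can now see that VACUUM is in progress. */
        vac->old_xacts_gone = true;
        vac->second_scan_limit = blkno;
    }
    if (!vac->old_xacts_gone)
        return; /* DEAD pointers stay DEAD; the second scan handles them */

    for (size_t i = 0; i < nitems; i++)
        if (items[i] == LP_SK_DEAD)
            items[i] = LP_SK_DEAD_RECLAIMED;
}

/* Pruning by another backend (assumed rule: only when no vacuum is pending). */
size_t backend_prune(LpState *items, size_t nitems, bool relvac_inprogress)
{
    size_t reclaimed = 0;

    if (relvac_inprogress)
        return 0; /* index pointers may not have been removed yet */

    for (size_t i = 0; i < nitems; i++) {
        if (items[i] == LP_SK_DEAD_RECLAIMED) {
            items[i] = LP_SK_UNUSED;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

With this model, a table whose old transactions finish while VACUUM is on
block 5 needs a second scan of blocks 0 to 4 only.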

In most of the cases where the old transactions quickly go away, and
for large tables, the second scan will be very limited. In the worst
case, we might incur the overhead of conditional waits without any
success.

TODO:

- We can potentially update FSM at the end of first pass. This is not
a significant issue if the second scan is very limited. But if we do
this, we need to handle the truncate case properly.

- As the patch stands, we check for old transactions at every block
iteration. This might not be acceptable for the cases where there are
long running transactions. We probably need some exponential gap here.

- As the patch stands, the heap_page_prune handles reclaiming the
DEAD_RECLAIMED line pointers since it already has ability to WAL log
similar changes. We don't do any extra work to trigger pruning though
(other than setting page_prune_xid). Maybe we should trigger pruning
if there is line pointer bloat in a page too.
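
One possible shape for the "exponential gap" mentioned in the second TODO
item, as a sketch only: the gap between checks doubles after each check, up
to a cap. The struct, the function names and the cap parameter are all
hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for spacing out the "are old transactions gone?" checks. */
typedef struct {
    size_t next_check_blk; /* first block at which to check again */
    size_t gap;            /* distance to the check after that */
    size_t max_gap;        /* upper bound on the gap */
} XactCheckBackoff;

void backoff_init(XactCheckBackoff *b, size_t max_gap)
{
    b->next_check_blk = 0;
    b->gap = 1;
    b->max_gap = max_gap;
}

/* Returns true if the scan should poll the old transactions at this block. */
bool backoff_should_check(XactCheckBackoff *b, size_t blkno)
{
    if (blkno < b->next_check_blk)
        return false;

    b->next_check_blk = blkno + b->gap;

    /* Double the gap for next time, without exceeding the cap. */
    if (b->gap < b->max_gap) {
        b->gap *= 2;
        if (b->gap > b->max_gap)
            b->gap = b->max_gap;
    }
    return true;
}
```

With a cap of 8 blocks the checks land on blocks 0, 1, 3, 7, 15, 23 and so
on, so a long-running transaction costs a bounded number of polls per block
range instead of one per block.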

Please let me know comments/suggestions and any other improvements.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
VACUUM_second_scan-v5.patch.gz application/x-gzip 10.2 KB

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: VACUUM Improvements - WIP Patch
Date: 2008-07-01 07:41:34
Message-ID: 1214898094.3845.537.camel@ebony.site
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-06-10 at 11:02 +0530, Pavan Deolasee wrote:

> In most of the cases where the old transactions quickly go away, and
> for large tables, the second scan will be very limited. In the worst
> case, we might incur the overhead of conditional waits without any
> success.

Looks good.

> - An additional boolean pg_class attribute (relvac_inprogress) is used
> to track the status of vacuum on a relation. If the attribute is true,
> either vacuum is in progress on the relation or the last vacuum did
> not complete successfully.
>
> When VACUUM starts, it sets relvac_inprogress to true.

What happens if the last VACUUM crashed? Any negative effects? If so,
should autovac be triggered again soon to complete the failed VACUUM?

> - We can potentially update FSM at the end of first pass. This is not
> a significant issue if the second scan is very limited. But if we do
> this, we need to handle the truncate case properly.

Not sure why we would do that. What would that give? To do that you'd
need to completely redesign FSM since it assumes only one update would
take place.

> - As the patch stands, we check for old transactions at every block
> iteration. This might not be acceptable for the cases where there are
> long running transactions. We probably need some exponential gap here.

I would make vacuum_delay_point() return bool rather than void, then you
can do the check each time we do the delay by saying:

    if (vacuum_delay_point())
    {
        /* a delay just happened: check whether the old transactions are gone */
    }

Need to change VacuumCostActive so it is always active during a VACUUM,
so we do accounting even when vacuum wait is zero.
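
To make the suggestion concrete, here is a self-contained sketch of a delay
point that reports whether it fired. It is not the real vacuum_delay_point();
the struct and function names are hypothetical, and the sleep is replaced by
a counter so the example runs instantly. Cost is accounted even when the
configured delay is zero, which is the behaviour asked for above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the cost-based vacuum delay state. */
typedef struct {
    int  cost_balance; /* cost accumulated since the last delay point fired */
    int  cost_limit;   /* balance at which the delay point fires */
    int  delay_ms;     /* configured delay; may be zero */
    long slept_ms;     /* total time we would have slept (sketch only) */
} CostDelaySketch;

/* Accounting happens unconditionally, even with delay_ms == 0. */
void charge_cost(CostDelaySketch *c, int cost)
{
    c->cost_balance += cost;
}

/* Returns true when the cost limit was reached and a delay was taken. */
bool delay_point_sketch(CostDelaySketch *c)
{
    if (c->cost_balance < c->cost_limit)
        return false;

    c->slept_ms += c->delay_ms; /* stands in for the actual sleep */
    c->cost_balance = 0;
    return true;
}
```

A caller could then run the old-transaction check only when
delay_point_sketch() returns true, which ties the check frequency to the
amount of work done rather than to the block count.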

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: VACUUM Improvements - WIP Patch
Date: 2008-07-12 17:32:13
Message-ID: 21751.1215883933@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Here is a WIP patch based on the discussions here:
> http://archives.postgresql.org/pgsql-hackers/2008-05/msg00863.php

I do not like this patch in any way, shape, or form.

(1) It's enormously overcomplicated, and therefore fragile.

(2) It achieves speedup of VACUUM by pushing work onto subsequent
regular accesses of the page, which is exactly the wrong thing.
Worse, once you count the disk writes those accesses will induce it's
not even clear that there's any genuine savings.

(3) The fact that it doesn't work until concurrent transactions have
gone away makes it of extremely dubious value in real-world scenarios,
as already noted by Simon.

It strikes me that what you are trying to do here is compensate for
a bad decision in the HOT patch, which was to have VACUUM's first
pass prune/defrag a page even when we know we are going to have to
come back to that page later. What about trying to fix things so
that if the page contains line pointers that need to be removed,
the first pass doesn't dirty it at all, but leaves all the work
to be done at the second visit? I think that since heap_page_prune
has been refactored into a "scan" followed by an "apply", it'd be
possible to decide before the "apply" step whether this is the case
or not.
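
A minimal sketch of the "decide before the apply step" idea, under one
interpretation of the paragraph above: dead heap-only tuples can be reclaimed
immediately, while dead tuples with index entries leave line pointers that
force a second visit. The types and function names are hypothetical and are
not heap_page_prune's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    IT_UNUSED,
    IT_LIVE,
    IT_DEAD_HEAP_ONLY, /* no index entries: can go straight to unused */
    IT_DEAD_INDEXED    /* index entries exist: needs the second visit */
} ItemKind;

/* Result of the read-only "scan" step. */
typedef struct {
    size_t reclaim_now;        /* items the apply step could free at once */
    size_t needs_second_visit; /* items that bring VACUUM back to this page */
} PrunePlan;

PrunePlan prune_scan(const ItemKind *items, size_t nitems)
{
    PrunePlan plan = { 0, 0 };

    for (size_t i = 0; i < nitems; i++) {
        if (items[i] == IT_DEAD_HEAP_ONLY)
            plan.reclaim_now++;
        else if (items[i] == IT_DEAD_INDEXED)
            plan.needs_second_visit++;
    }
    return plan;
}

/* First pass dirties the page only if it will not have to come back. */
bool first_pass_should_apply(const PrunePlan *plan)
{
    return plan->needs_second_visit == 0 && plan->reclaim_now > 0;
}

/* The "apply" step for the simple case; returns the number of items freed. */
size_t prune_apply(ItemKind *items, size_t nitems)
{
    size_t freed = 0;

    for (size_t i = 0; i < nitems; i++) {
        if (items[i] == IT_DEAD_HEAP_ONLY) {
            items[i] = IT_UNUSED;
            freed++;
        }
    }
    return freed;
}
```

Under this model a page that must be revisited is written, and WAL-logged,
once instead of twice.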

regards, tom lane


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-07-14 03:50:02
Message-ID: 2e78013d0807132050j16cdc558s4dc3f889371a937d@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

(taking the discussions to -hackers)

On Sat, Jul 12, 2008 at 11:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>
> (2) It achieves speedup of VACUUM by pushing work onto subsequent
> regular accesses of the page, which is exactly the wrong thing.
> Worse, once you count the disk writes those accesses will induce it's
> not even clear that there's any genuine savings.
>

Well in the worst case that is true. But in most other cases, the
second pass work will be combined with other normal activities and the
overhead will be shared, at least there is a chance for that. I think
there is a chance for delaying the work until there is any real need
for that e.g. INSERT or UPDATE on the page which would require a free
line pointer.

> (3) The fact that it doesn't work until concurrent transactions have
> gone away makes it of extremely dubious value in real-world scenarios,
> as already noted by Simon.
>

If there are indeed long running concurrent transactions, we won't get
any benefit of this optimization. But then there are several more
common cases of very short concurrent transactions. In those cases and
for very large tables, reducing the vacuum time is a significant win.
The FSM will be written early and significant work of the VACUUM can
be finished quickly.

> It strikes me that what you are trying to do here is compensate for
> a bad decision in the HOT patch, which was to have VACUUM's first
> pass prune/defrag a page even when we know we are going to have to
> come back to that page later. What about trying to fix things so
> that if the page contains line pointers that need to be removed,
> the first pass doesn't dirty it at all, but leaves all the work
> to be done at the second visit? I think that since heap_page_prune
> has been refactored into a "scan" followed by an "apply", it'd be
> possible to decide before the "apply" step whether this is the case
> or not.
>

I am not against this idea. Just that it still requires a double scan
of the main table and that's exactly what we are trying to avoid with
this patch.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-07-14 15:23:45
Message-ID: 8741.1216049025@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> (taking the discussions to -hackers)
> On Sat, Jul 12, 2008 at 11:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> (2) It achieves speedup of VACUUM by pushing work onto subsequent
>> regular accesses of the page, which is exactly the wrong thing.
>> Worse, once you count the disk writes those accesses will induce it's
>> not even clear that there's any genuine savings.

> Well in the worst case that is true. But in most other cases, the
> second pass work will be combined with other normal activities and the
> overhead will be shared, at least there is a chance for that. I think
> there is a chance for delaying the work until there is any real need
> for that e.g. INSERT or UPDATE on the page which would require a free
> line pointer.

That's just arm-waving: right now, pruning will be done by the next
*reader* of the page, whether or not he has any intention of *writing*
it. With no proposal on the table for improving that situation,
I don't see any credibility in arguing for over-complicating VACUUM
on the grounds that it might happen someday. In any case, the work
that is supposed to be done by VACUUM is being pushed to a foreground
query, which I find to be completely against our design principles.

>> It strikes me that what you are trying to do here is compensate for
>> a bad decision in the HOT patch, which was to have VACUUM's first
>> pass prune/defrag a page even when we know we are going to have to
>> come back to that page later. What about trying to fix things so
>> that if the page contains line pointers that need to be removed,
>> the first pass doesn't dirty it at all, but leaves all the work
>> to be done at the second visit?

> I am not against this idea. Just that it still requires a double scan
> of the main table and that's exactly what we are trying to avoid with
> this patch.

The part of the argument that I found convincing was trying to reduce
the write traffic (especially WAL log output), not avoiding a second
read. And the fundamental point still remains: the work should be done
in background, not foreground.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-07-14 17:15:42
Message-ID: 87hcasmloh.fsf@oxford.xeocode.com
Lists: pgsql-hackers pgsql-patches

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>>> It strikes me that what you are trying to do here is compensate for
>>> a bad decision in the HOT patch, which was to have VACUUM's first
>>> pass prune/defrag a page even when we know we are going to have to
>>> come back to that page later. What about trying to fix things so
>>> that if the page contains line pointers that need to be removed,
>>> the first pass doesn't dirty it at all, but leaves all the work
>>> to be done at the second visit?
>
>> I am not against this idea. Just that it still requires a double scan
>> of the main table and that's exactly what we are trying to avoid with
>> this patch.
>
> The part of the argument that I found convincing was trying to reduce
> the write traffic (especially WAL log output), not avoiding a second
> read. And the fundamental point still remains: the work should be done
> in background, not foreground.

I like the idea of only having to do a single pass through the table though.
Couldn't Pavan's original plan still work and just not have other clients try
to remove dead line pointers? At least not unless they're also pruning the
page due to an insert or update anyways?

Fundamentally it does seem like we want to rotate vacuum's work-load. What
we're doing now is kind of backwards.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-07-14 17:55:00
Message-ID: 28739.1216058100@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> I like the idea of only having to do a single pass through the table though.

Well, that argument was already overstated: we're not re-reading all of
the table now. Just the pages containing dead line pointers.

> Couldn't Pavan's original plan still work and just not have other clients try
> to remove dead line pointers?

You could simply delay recycling of the really-truly-dead line pointers
until the next VACUUM, I suppose. It's not clear how bad a
line-pointer-bloat problem that might leave you with. (It would still
require tracking whether the last vacuum had completed successfully.
I note that any simple approach to that would foreclose ever doing
partial-table vacuums, which is something I thought was on the table
as soon as we had dead space mapping ability.)

> At least not unless they're also pruning the
> page due to an insert or update anyways?

Please stop pretending that this overhead will only be paid by
insert/update. The current design for pruning does not work that way,
and we do not have a better design available.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-07-14 18:24:14
Message-ID: 87bq10nx2p.fsf@oxford.xeocode.com
Lists: pgsql-hackers pgsql-patches

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> I like the idea of only having to do a single pass through the table though.
>
> Well, that argument was already overstated: we're not re-reading all of
> the table now. Just the pages containing dead line pointers.
>
>> Couldn't Pavan's original plan still work and just not have other clients try
>> to remove dead line pointers?
>
> You could simply delay recycling of the really-truly-dead line pointers
> until the next VACUUM, I suppose. It's not clear how bad a
> line-pointer-bloat problem that might leave you with. (It would still
> require tracking whether the last vacuum had completed successfully.
> I note that any simple approach to that would foreclose ever doing
> partial-table vacuums, which is something I thought was on the table
> as soon as we had dead space mapping ability.)

Well there were three suggestions on how to track whether the last vacuum
committed or not. Keeping the last vacuum id in pg_class, keeping it per-page,
and keeping it per line pointer. ISTM either of the latter two would work with
partial table vacuums. Per line-pointer xids seemed excessively complicated to
me but per-page vacuum ids don't seem problematic.

I would definitely agree that partial-table vacuums are an essential part of
the future.

>> At least not unless they're also pruning the
>> page due to an insert or update anyways?
>
> Please stop pretending that this overhead will only be paid by
> insert/update. The current design for pruning does not work that way,
> and we do not have a better design available.

Well I'm not pretending. I guess there's something I'm missing here. I thought
we only pruned when we wanted to insert a new tuple and found not enough
space.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-07-15 08:23:50
Message-ID: 487C5E96.1020907@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Gregory Stark wrote:
> I thought
> we only pruned when we wanted to insert a new tuple and found not enough
> space.

Nope, we prune on any access to the page, if the page is "full enough",
and the pd_prune_xid field suggests that there is something to prune.

The problem with only pruning on inserts is that by the time we get to
heap_insert/heap_update, we're already holding a pin on the page, which
prevents us from acquiring the vacuum lock.
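
The check described in the first paragraph can be sketched roughly as
follows. This is a simplified model, not the actual heap code: the struct,
the function names and the min_free threshold parameter are hypothetical, and
a prune_xid of 0 is used here to mean "nothing to prune".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much reduced view of a heap page header. */
typedef struct {
    uint32_t prune_xid;  /* oldest xid whose removal might free space; 0 = none */
    size_t   free_bytes; /* free space currently available on the page */
} PageSketch;

/* Wraparound-aware "a is older than b" comparison on 32-bit xids. */
bool xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Decide, on any access to the page, whether pruning is worth attempting. */
bool should_try_prune(const PageSketch *page, uint32_t oldest_xmin,
                      size_t min_free)
{
    if (page->prune_xid == 0)
        return false; /* no hint that anything is prunable */
    if (!xid_precedes(page->prune_xid, oldest_xmin))
        return false; /* the deleting transaction may still be visible */
    return page->free_bytes < min_free; /* "full enough" test */
}
```

Because the decision is taken on read access as well, the caller still has
to get the vacuum lock before actually pruning, which is the constraint the
second paragraph refers to.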

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-23 03:36:24
Message-ID: 200808230336.m7N3aOc15945@momjian.us
Lists: pgsql-hackers pgsql-patches


I assume there is no TODO here.

---------------------------------------------------------------------------

Pavan Deolasee wrote:
> (taking the discussions to -hackers)
>
> On Sat, Jul 12, 2008 at 11:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> >
> > (2) It achieves speedup of VACUUM by pushing work onto subsequent
> > regular accesses of the page, which is exactly the wrong thing.
> > Worse, once you count the disk writes those accesses will induce it's
> > not even clear that there's any genuine savings.
> >
>
> Well in the worst case that is true. But in most other cases, the
> second pass work will be combined with other normal activities and the
> overhead will be shared, at least there is a chance for that. I think
> there is a chance for delaying the work until there is any real need
> for that e.g. INSERT or UPDATE on the page which would require a free
> line pointer.
>
>
> > (3) The fact that it doesn't work until concurrent transactions have
> > gone away makes it of extremely dubious value in real-world scenarios,
> > as already noted by Simon.
> >
>
> If there are indeed long running concurrent transactions, we won't get
> any benefit of this optimization. But then there are several more
> common cases of very short concurrent transactions. In those cases and
> for very large tables, reducing the vacuum time is a significant win.
> The FSM will be written early and significant work of the VACUUM can
> be finished quickly.
>
> > It strikes me that what you are trying to do here is compensate for
> > a bad decision in the HOT patch, which was to have VACUUM's first
> > pass prune/defrag a page even when we know we are going to have to
> > come back to that page later. What about trying to fix things so
> > that if the page contains line pointers that need to be removed,
> > the first pass doesn't dirty it at all, but leaves all the work
> > to be done at the second visit? I think that since heap_page_prune
> > has been refactored into a "scan" followed by an "apply", it'd be
> > possible to decide before the "apply" step whether this is the case
> > or not.
> >
>
> I am not against this idea. Just that it still requires a double scan
> of the main table and that's exactly what we are trying to avoid with
> this patch.
>
> Thanks,
> Pavan
>
> --
> Pavan Deolasee
> EnterpriseDB http://www.enterprisedb.com
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-24 13:30:18
Message-ID: b42b73150808240630r59ad34f3l24a980280cdc52d9@mail.gmail.com
Lists: pgsql-hackers pgsql-patches

On Fri, Aug 22, 2008 at 11:36 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> I assume there is no TODO here.

Well, there doesn't seem to be a TODO for partial/restartable vacuums,
which were mentioned upthread. This is a really desirable feature for
big databases and removes one of the reasons to partition large
tables.

merlin


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-24 15:49:51
Message-ID: 48B1831F.7010002@commandprompt.com
Lists: pgsql-hackers pgsql-patches

Merlin Moncure wrote:
> On Fri, Aug 22, 2008 at 11:36 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> I assume there is no TODO here.
>
> Well, there doesn't seem to be a TODO for partial/restartable vacuums,
> which were mentioned upthread. This is a really desirable feature for
> big databases and removes one of the reasons to partition large
> tables.

I would agree that partial vacuums would be very useful.

Joshua D. Drake

>
> merlin
>


From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-24 18:40:23
Message-ID: 48B1AB17.9010905@zeut.net
Lists: pgsql-hackers pgsql-patches

Joshua D. Drake wrote:
> Merlin Moncure wrote:
>> Well, there doesn't seem to be a TODO for partial/restartable vacuums,
>> which were mentioned upthread. This is a really desirable feature for
>> big databases and removes one of the reasons to partition large
>> tables.
> I would agree that partial vacuums would be very useful.

I think everyone agrees that partial vacuums would be useful / *A Good
Thing* but it's the implementation that is the issue. I was thinking
about Alvaro's recent work to make vacuum deal with TOAST tables
separately, which is almost like a partial vacuum since it effectively
splits the vacuum work up into multiple independent blocks of work, the
limitation obviously being that it can only split the work around
TOAST. Is there any way that vacuum could work per relfile since we
already split tables into files that are never greater than 1G? I would
think that if Vacuum never had more than 1G of work to do at any given
moment it would make it much more manageable.
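
As a sketch of what "per relfile" work units might look like, the block
range covered by each 1G segment can be computed as below. The constants
assume the default 8 kB block size and 1 GB segment size; the macro and
function names are hypothetical, and nothing here addresses how the index
cleanup would be split.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_BLOCK_SIZE     8192u                     /* assumed default */
#define SK_SEGMENT_BYTES  (1024u * 1024u * 1024u)   /* assumed 1 GB segments */
#define SK_BLOCKS_PER_SEG (SK_SEGMENT_BYTES / SK_BLOCK_SIZE) /* 131072 */

/* Number of segment files needed for a table of nblocks blocks. */
size_t segment_count(size_t nblocks)
{
    return (nblocks + SK_BLOCKS_PER_SEG - 1) / SK_BLOCKS_PER_SEG;
}

/*
 * Computes the half-open block range [*start, *end) of segment segno.
 * Returns false if the table has no such segment.
 */
bool segment_block_range(size_t nblocks, size_t segno,
                         size_t *start, size_t *end)
{
    if (segno >= segment_count(nblocks))
        return false;

    *start = segno * SK_BLOCKS_PER_SEG;
    *end = *start + SK_BLOCKS_PER_SEG;
    if (*end > nblocks)
        *end = nblocks; /* the last segment is usually partial */
    return true;
}
```

For a table of 300000 blocks this yields three work units, the last one
covering blocks 262144 through 299999.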


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-24 19:16:08
Message-ID: 25792.1219605368@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

"Matthew T. O'Connor" <matthew(at)zeut(dot)net> writes:
> I think everyone agrees that partial vacuums would be useful / *A Good
> Thing* but it's the implementation that is the issue.

I'm not sure how important it will really be once we have support for
dead-space-map-driven vacuum.

regards, tom lane


From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-25 02:31:37
Message-ID: 48B21989.9090200@zeut.net
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> "Matthew T. O'Connor" <matthew(at)zeut(dot)net> writes:
>
>> I think everyone agrees that partial vacuums would be useful / *A Good
>> Thing* but it's the implementation that is the issue.
>>
>
> I'm not sure how important it will really be once we have support for
> dead-space-map-driven vacuum.

Is that something we can expect any time soon? I haven't heard much
about it really happening for 8.4.


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-08-25 07:24:01
Message-ID: 48B25E11.5080803@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> [ off-list ]
>
> "Matthew T. O'Connor" <matthew(at)zeut(dot)net> writes:
>> Tom Lane wrote:
>>> I'm not sure how important it will really be once we have support for
>>> dead-space-map-driven vacuum.
>
>> Is that something we can expect any time soon? I haven't heard much
>> about it really happening for 8.4.
>
> AFAIK Heikki has every intention of making it happen for 8.4. I'll
> let him answer on-list, though.

I do. Unfortunately I've been swamped, and still am, with other stuff,
and have had zero time to work on it :-(.

My current plan is to finish the FSM rewrite for/during the September
commit fest, and submit a patch for dead-space-map driven VACUUM for the
November commit fest.

My original plan was to enable index-only-scans using the DSM as well
for 8.4, but it's pretty clear at this point that I don't have the time
to finish that :-(.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Abhijit Menon-Sen <ams(at)oryx(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] VACUUM Improvements - WIP Patch
Date: 2008-09-01 06:29:29
Message-ID: 48BB8BC9.7080508@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

Abhijit Menon-Sen wrote:
> At 2008-08-25 10:24:01 +0300, heikki(at)enterprisedb(dot)com wrote:
>> My original plan was to enable index-only-scans using the DSM as well
>> for 8.4, but it's pretty clear at this point that I don't have the
>> time to finish that :-(.
>
> I wonder how hard that would be.

It's doable, for sure.

The pieces I see as required for that are:
1. change the indexam API so that indexes can return tuples
2. make sure the DSM is suitable for index-only-scans. Ie. it must be
completely up-to-date and WAL-logged, so that if the DSM says that all
tuples on a page are visible, they really must be.
3. planner/stats changes, so that the planner can estimate how much of
an index scan can be satisfied without looking at the heap (it's not an
all-or-nothing plan-time decision with this design)
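
Pieces 2 and 3 can be illustrated with a toy model of the map: one bit per
heap page meaning "all tuples on this page are visible". The structure and
function names are hypothetical, and the sketch says nothing about how the
map is kept up to date or WAL-logged, which is the hard part.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map: bit N set means every tuple on heap page N is visible. */
typedef struct {
    const uint8_t *bits;
    size_t         npages;
} VisMapSketch;

/* Executor side: may the index scan skip the heap fetch for this page? */
bool page_all_visible(const VisMapSketch *map, size_t blkno)
{
    if (blkno >= map->npages)
        return false; /* unknown pages are treated as not all-visible */
    return (map->bits[blkno / 8] >> (blkno % 8)) & 1;
}

/* Planner side: fraction of pages for which a heap visit is still needed. */
double heap_fetch_fraction(const VisMapSketch *map)
{
    size_t need_heap = 0;

    if (map->npages == 0)
        return 1.0; /* no information: assume every fetch hits the heap */

    for (size_t blk = 0; blk < map->npages; blk++)
        if (!page_all_visible(map, blk))
            need_heap++;

    return (double) need_heap / (double) map->npages;
}
```

The per-page test is what makes the decision partial rather than
all-or-nothing: the same scan skips the heap for some tuples and visits it
for others.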

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com