Re: [HACKERS] Automatic free space map filling

From: "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Automatic free space map filling
Date: 2006-03-02 08:53:57
Message-ID: E1539E0ED7043848906A8FF995BDA579D989B5@m0143.s-mxs.net
Lists: pgsql-hackers pgsql-patches


> I thought we had sufficiently destroyed that "reuse a tuple"
> meme yesterday. You can't do that: there are too many
> aspects of the system design that are predicated on the
> assumption that dead tuples do not come back to life. You
> have to do the full vacuuming bit (index entry removal,
> super-exclusive page locking, etc) before you can remove a dead tuple.

One more idea I would like to throw in.
Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
tuple by reducing the tuple to its header info.
(If you still wanted to be able to locate index entries fast,
you would need to keep indexed columns, but I think we agreed that there
is no real use)

I think that would be achievable at reasonable cost (since you can avoid
one page IO) on the page of the currently active tuple (the first page
that is considered).

On this page:
  if freespace available
    --> use it
  elsif freespace available after reducing all dead rows
    --> use the freespace with a new slot
  else ....

Of course this only works when we still have free slots,
but I think that might not really be an issue.

Andreas
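
The per-page decision above can be sketched roughly as follows. This is only
an illustration, not PostgreSQL code: the Page and Tuple classes, the
place() function and the 32-byte HEADER_SIZE are assumptions made for the
example (the header figure is the one quoted later in this thread).

```python
from dataclasses import dataclass, field

HEADER_SIZE = 32  # assumed tuple header size in bytes


@dataclass
class Tuple:
    size: int            # bytes occupied on the page, header included
    dead: bool = False


@dataclass
class Page:
    capacity: int        # usable bytes on the page
    max_slots: int       # number of slots (line pointers) available
    tuples: list = field(default_factory=list)

    def free_space(self):
        return self.capacity - sum(t.size for t in self.tuples)

    def reclaimable(self):
        # Bytes gained if every dead tuple were cut down to its header.
        return sum(t.size - HEADER_SIZE
                   for t in self.tuples if t.dead and t.size > HEADER_SIZE)

    def reduce_dead(self):
        for t in self.tuples:
            if t.dead and t.size > HEADER_SIZE:
                t.size = HEADER_SIZE


def place(page, new_size):
    """Try to place a new tuple version on this page; report what happened."""
    if len(page.tuples) >= page.max_slots:
        return "other page"                  # no free slot left
    if page.free_space() >= new_size:
        page.tuples.append(Tuple(new_size))
        return "free space"
    if page.free_space() + page.reclaimable() >= new_size:
        page.reduce_dead()                   # dead rows keep only their headers
        page.tuples.append(Tuple(new_size))
        return "after reducing dead rows"
    return "other page"
```

Note that the dead tuples keep their slots, so the scheme only helps while
the page still has a free slot for the new version.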


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Automatic free space map filling
Date: 2006-03-02 09:07:00
Message-ID: 1141290420.3737.23.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On one fine day, Thu, 2006-03-02 at 09:53, Zeugswetter
Andreas DCP SD wrote:
> > I thought we had sufficiently destroyed that "reuse a tuple"
> > meme yesterday. You can't do that: there are too many
> > aspects of the system design that are predicated on the
> > assumption that dead tuples do not come back to life. You
> > have to do the full vacuuming bit (index entry removal,
> > super-exclusive page locking, etc) before you can remove a dead tuple.
>
> One more idea I would like to throw in.
> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> tuple by reducing the tuple to its header info.
> (If you still wanted to be able to locate index entries fast,
> you would need to keep indexed columns, but I think we agreed that there
> is no real use)

I don't even think you need the header; just truncate the slot to be
0-size (the next pointer is the same as this one, or make the pointer
point to an unaligned byte or something) and detect this condition when
accessing tuples. This would add one compare to all accesses to the tuple,
but I suspect that it is mostly a no-op performance-wise, as all the data
needed is already available in the level-1 cache.

This would decouple declaring a tuple to be dead/reuse data space and
final cleanup/free index space.

--------------------
Hannu
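
A minimal sketch of the zero-size-slot check described above, assuming a
simplified line pointer that is just an (offset, length) pair. LinePointer,
fetch() and truncate() are hypothetical names for illustration; the real
on-page layout is different.

```python
from typing import NamedTuple, Optional


class LinePointer(NamedTuple):
    offset: int
    length: int   # 0 marks a slot whose tuple data has been truncated away


def truncate(lp: LinePointer) -> LinePointer:
    """Declare the tuple dead and give up its data space, keeping the slot."""
    return lp._replace(length=0)


def fetch(page_bytes: bytes, lp: LinePointer) -> Optional[bytes]:
    if lp.length == 0:        # the one extra compare per tuple access
        return None
    return page_bytes[lp.offset:lp.offset + lp.length]
```

The slot itself stays allocated so that index entries pointing at it remain
harmless until the final cleanup removes them.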


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Automatic free space map filling
Date: 2006-03-02 15:05:28
Message-ID: 2657.1141311928@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Hannu Krosing <hannu(at)skype(dot)net> writes:
> On one fine day, Thu, 2006-03-02 at 09:53, Zeugswetter
> Andreas DCP SD wrote:
>> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
>> tuple by reducing the tuple to its header info.

> I don't even think you need the header, just truncate the slot to be
> 0-size

I think you must keep the header because the tuple might be part of an
update chain (cf vacuuming bugs we repaired just a few months ago).
t_ctid is potentially interesting data even in a certainly-dead tuple.

Andreas' idea is possibly doable but I am not sure that I see the point.
It does not reduce the need for vacuum nor the I/O load imposed by
vacuum. What it does do is bias the system in the direction of
allocating an unreasonably large number of tuple line pointers on a page
(ie, more than are useful when the page is fully packed with normal
tuples). Since we never reclaim such pointers, over time all the pages
in a table would tend to develop line-pointer-bloat. I don't know what
the net overhead would be, but it'd definitely impose some aggregate
inefficiency.

regards, tom lane
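
To get a rough feel for the size of this effect, here is a
back-of-the-envelope sketch. It assumes the default 8192-byte page, 4-byte
line pointers and 32-byte tuple headers (the latter two figures are the ones
quoted elsewhere in this thread), and it ignores the page header and
alignment, so the numbers are only indicative.

```python
PAGE_SIZE = 8192      # default PostgreSQL block size, in bytes
LINE_POINTER = 4      # bytes per line pointer
TUPLE_HEADER = 32     # bytes per tuple header (assumed, see lead-in)


def max_pointers(tuple_size):
    """Line pointers a page can use when packed with tuples of this size
    (tuple_size includes the header)."""
    return PAGE_SIZE // (tuple_size + LINE_POINTER)


def bloat_bytes(allocated_pointers, tuple_size):
    """Bytes held by line pointers beyond what a fully packed page of
    normal tuples of this size could ever use."""
    useful = max_pointers(tuple_size)
    return max(0, allocated_pointers - useful) * LINE_POINTER


# Upper bound: a page that was once filled entirely with header-only tuples.
worst_case = max_pointers(TUPLE_HEADER)
print(worst_case, max_pointers(200), bloat_bytes(worst_case, 200))
```

Under these assumptions a page could accumulate up to 227 line pointers,
while only 40 are useful for 200-byte tuples; the surplus would hold 748
bytes, about 9% of the page, for as long as the pointers are not reclaimed.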


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Zeugswetter Andreas DCP SD <ZeugswetterA(at)spardat(dot)at>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Automatic free space map filling
Date: 2006-03-04 01:49:25
Message-ID: 20060304014924.GY82012@pervasive.com
Lists: pgsql-hackers pgsql-patches

On Thu, Mar 02, 2006 at 10:05:28AM -0500, Tom Lane wrote:
> Hannu Krosing <hannu(at)skype(dot)net> writes:
> > On one fine day, Thu, 2006-03-02 at 09:53, Zeugswetter
> > Andreas DCP SD wrote:
> >> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> >> tuple by reducing the tuple to its header info.
>
> Andreas' idea is possibly doable but I am not sure that I see the point.
> It does not reduce the need for vacuum nor the I/O load imposed by
> vacuum. What it does do is bias the system in the direction of
> allocating an unreasonably large number of tuple line pointers on a page
> (ie, more than are useful when the page is fully packed with normal
> tuples). Since we never reclaim such pointers, over time all the pages
> in a table would tend to develop line-pointer-bloat. I don't know what
> the net overhead would be, but it'd definitely impose some aggregate
> inefficiency.

What would be involved in reclaiming item pointer space? Is there any
reason it's not done today? (I know I've been bit once by this...)
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Automatic free space map filling
Date: 2006-03-09 06:52:18
Message-ID: 20060309152300.4C45.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Lists: pgsql-hackers pgsql-patches

"Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at> wrote:

> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> tuple by reducing the tuple to its header info.

I was just working on your idea. In my work, bgwriter truncates
dead tuples and leaves only their headers. I'll send a concept patch
to PATCHES.

We must take a super-exclusive lock on a page before vacuuming it. Bgwriter
tries to take an exclusive lock before it writes a page, and vacuums only if
the lock turns out to be super-exclusive. Otherwise, it gives up and writes
the page normally. This is an optimistic approach, but I assume the chance of
success is high because most pages written by bgwriter are the least
recently used (LRU).

Also, I changed bgwriter_lru_maxpages to be adjusted automatically. Backends
will not vacuum, so as not to disturb main transaction processing, so
bgwriter should write most of the dirty pages.

There is much room for discussion on this idea.
Comments are welcome.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Automatic free space map filling
Date: 2006-03-09 06:53:28
Message-ID: 20060309152716.4C48.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Lists: pgsql-hackers pgsql-patches

"Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at> wrote:

> Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> tuple by reducing the tuple to its header info.

The attached patch realizes the concept of his idea. Dead tuples are
reduced to their headers by bgwriter.

This patch is incomplete, so please discuss in the thread on HACKERS.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories

Attachment: bgvacuum-0309.patch.txt (application/octet-stream, 32.5 KB)

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Automatic free space map filling
Date: 2006-03-10 15:41:28
Message-ID: 1142005289.16417.23.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On Thu, 2006-03-09 at 15:53 +0900, ITAGAKI Takahiro wrote:
> "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at> wrote:
>
> > Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> > tuple by reducing the tuple to its header info.
>
> The attached patch realizes the concept of his idea. Dead tuples are
> reduced to their headers by bgwriter.
>
> This patch is incomplete, so please discuss in the thread on HACKERS.

I'm interested in this patch but you need to say more about it. I get
the general idea but it would be useful if you could give a full
description of what this patch is trying to do and why.

Thanks,

Best Regards, Simon Riggs


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] Automatic free space map filling
Date: 2006-03-13 08:38:01
Message-ID: 20060313172901.4A08.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> > "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at> wrote:
> > > Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> > > tuple by reducing the tuple to its header info.
> >
> > The attached patch realizes the concept of his idea. Dead tuples are
> > reduced to their headers by bgwriter.
>
> I'm interested in this patch but you need to say more about it. I get
> the general idea but it would be useful if you could give a full
> description of what this patch is trying to do and why.

OK, I'll try to explain the patch. Please excuse the long message.

* Purpose
The basic idea is just "reducing the dead tuple to its header info",
suggested by Andreas. This is a lightweight per-page sweep that reduces
free space map consumption and the need for VACUUM; it does not remove
that need, so a normal VACUUM is still required occasionally.

I think it is useful on heavy-update workloads. It showed a 5-10%
performance improvement on DBT-2 after running for 9 hours *without* vacuum.
I don't know whether it is still effective with well-scheduled vacuum.

* Why does bgwriter do vacuum?
Sweeping has a cost, so a non-backend process should do it. Also, the pages
worth vacuuming are almost always dirty, because tuples on them have just
been updated or deleted. Bgwriter handles dirty pages, so I think it is a
good place for sweeping.

* Locking
We must take a super-exclusive lock on a page before vacuuming it. In the
patch, bgwriter tries to take an exclusive lock before it writes a page, and
vacuums only if the lock turns out to be super-exclusive. Otherwise, it gives
up and writes the page normally. This is an optimistic approach, but I assume
the chance of success is high because most pages written by bgwriter are the
least recently used (LRU).
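
The optimistic locking scheme can be pictured with the simplified model
below. It is not the patch's code: Buffer, sweep() and bgwriter_write() are
hypothetical names, and "super-exclusive" is modeled as holding the
exclusive lock while no other process has the buffer pinned.

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    other_pins: int = 0      # pins held by processes other than bgwriter
    dead_tuples: int = 0
    actions: list = field(default_factory=list)


def sweep(buf):
    """Stand-in for reducing the page's dead tuples to their headers."""
    buf.dead_tuples = 0
    buf.actions.append("sweep")


def bgwriter_write(buf):
    # Assume the exclusive content lock has just been acquired here.
    if buf.other_pins == 0:          # nobody else is looking: super-exclusive
        sweep(buf)
    # Otherwise give up on sweeping; the page is written either way.
    buf.actions.append("write")
    return buf.actions
```

A page that some backend still has pinned is simply written unswept and may
get another chance the next time bgwriter visits it.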

* Keep the headers
We cannot remove dead tuples completely in a per-page sweep, because
references to the tuples from indexes still remain. We might keep only the
line pointers (4 bytes), but that might lead to the line-pointer-bloat problem
(http://archives.postgresql.org/pgsql-hackers/2006-03/msg00116.php),
so the headers (4+32 bytes) should be left.
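
As a rough illustration of the trade-off, the sketch below compares how many
bytes dead tuples still occupy after a sweep when the header is kept versus
when only the line pointer is kept. It uses the 4-byte and 32-byte figures
from this message; the function names are made up for the example, and each
tuple size is assumed to include the header and to be at least 32 bytes.

```python
LINE_POINTER = 4    # bytes, figure from this message
TUPLE_HEADER = 32   # bytes, figure from this message


def bytes_after_sweep(dead_tuple_sizes, keep_header=True):
    """Bytes still occupied by dead tuples after a per-page sweep."""
    per_tuple = LINE_POINTER + (TUPLE_HEADER if keep_header else 0)
    return per_tuple * len(dead_tuple_sizes)


def bytes_reclaimed(dead_tuple_sizes, keep_header=True):
    """Bytes freed by the sweep, relative to leaving the dead tuples alone."""
    before = sum(size + LINE_POINTER for size in dead_tuple_sizes)
    return before - bytes_after_sweep(dead_tuple_sizes, keep_header)


dead = [200] * 10   # ten dead 200-byte tuples on one page
print(bytes_reclaimed(dead, keep_header=True))    # header kept
print(bytes_reclaimed(dead, keep_header=False))   # line pointer only
```

For ten dead 200-byte tuples, keeping the headers frees 1680 bytes and
leaves 360 occupied, while keeping only line pointers would free 2000 bytes
and leave 40.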

* Other twists and GUC variables in the patch
- Bgwriter cannot access the catalogs, so I added a BM_RELATION hint bit
  to BufferDesc. Only relation pages will be swept. This is enabled by
  the GUC variable 'bgvacuum_relation'.
- I changed bgwriter_lru_maxpages to be adjusted automatically. Backends
  will not vacuum, so as not to disturb their processing, so bgwriter should
  write most of the dirty pages. ('bgvacuum_autotune')
- After sweeping, the page is added to the free space map. I made a simple
  replacement algorithm for the free space map that replaces the page with
  the least space near the added one. ('bgvacuum_fsm')

* Issues
- If WAL is produced by sweeping a page, writing the page should be deferred
  for a while, because the WAL must be flushed before the page is written.
- Bgwriter writes pages in 4 contexts: background writes for LRU, ALL,
  checkpoint, and shutdown. In the current patch, pages are swept in the 3
  contexts other than shutdown, but it may be better to do so only on LRU.
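
The first issue is the usual WAL-before-data rule. A minimal sketch, assuming
each dirty page remembers the log position (LSN) of the last WAL record that
touched it; DirtyPage and split_by_wal_flush() are hypothetical names, not
part of the patch:

```python
from typing import NamedTuple


class DirtyPage(NamedTuple):
    page_id: int
    lsn: int    # position of the last WAL record that modified this page


def split_by_wal_flush(pages, flushed_lsn):
    """Partition dirty pages into (writable now, must wait for a WAL flush)."""
    writable = [p for p in pages if p.lsn <= flushed_lsn]
    deferred = [p for p in pages if p.lsn > flushed_lsn]
    return writable, deferred
```

A page that was just swept carries a fresh LSN, so it would typically land in
the deferred group until the WAL has been flushed past that point.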

* Related discussions
- Real-Time Vacuum Possibility (Rod Taylor)
  http://archives.postgresql.org/pgsql-hackers/2005-03/msg00518.php
  | have the bgwriter take a look at the pages it has, and see if it can do
  | any vacuum work based on pages it is about to send to disk
- Pre-allocated free space for row updating (like PCTFREE) (Satoshi Nagayasu)
  http://archives.postgresql.org/pgsql-hackers/2005-08/msg01135.php
  | light-weight repairing on a single page is needed to maintain free space
- Dead Space Map (Heikki Linnakangas)
  http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php
  | vacuuming pages one by one as they're written by bgwriter

Thank you for reading to the end.
I'd like to hear your comments.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] Automatic free space map filling
Date: 2006-03-14 08:55:40
Message-ID: 1142326540.11178.38.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On Mon, 2006-03-13 at 17:38 +0900, ITAGAKI Takahiro wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > > "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at> wrote:
> > > > Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> > > > tuple by reducing the tuple to its header info.
> > >
> > > The attached patch realizes the concept of his idea. Dead tuples are
> > > reduced to their headers by bgwriter.
> >
> > I'm interested in this patch but you need to say more about it. I get
> > the general idea but it would be useful if you could give a full
> > description of what this patch is trying to do and why.
>
> OK, I'll try to explain the patch. Please excuse the long message.

OK. I'll take a look at this, thanks.

Best Regards, Simon Riggs


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Automatic free space map filling
Date: 2006-06-16 18:48:59
Message-ID: 200606161848.k5GImx212417@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches


Added to TODO list with URL.

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> "Zeugswetter Andreas DCP SD" <ZeugswetterA(at)spardat(dot)at> wrote:
>
> > Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
> > tuple by reducing the tuple to its header info.
>
> The attached patch realizes the concept of his idea. Dead tuples are
> reduced to their headers by bgwriter.
>
> This patch is incomplete, so please discuss in the thread on HACKERS.
>
> ---
> ITAGAKI Takahiro
> NTT Cyber Space Laboratories
>

[ Attachment, skipping... ]


--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +