Re: Is anybody actually using XLR_BKP_REMOVABLE?

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 15:17:17
Message-ID: 18626.1323703037@sss.pgh.pa.us

Back in 2007 (commit a8d539f12498de52453c8113892cbf48cc62478d), we
reduced the maximum number of backup blocks per WAL record from 4 to 3,
in order to permit addition of an XLR_BKP_REMOVABLE flag bit that
purports to show whether it's safe to suppress full-page-image backup
blocks in an external WAL-filtering program. I'm having a problem with
this now because I don't see any way to get SP-GiST's leaf page
splitting operation to touch fewer than four pages. So I'd like to
propose reverting that decision and again allowing a WAL record to touch
as many as four pages.
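
To make the bit budget concrete, here is a minimal sketch of the trade-off.
It assumes the info byte has four low bits available for this purpose, one
per backup block counting down from 0x08, with the lowest bit currently
used for the removable flag. The OLD_/NEW_ names and helper functions are
invented for illustration and are not the identifiers in the PostgreSQL
headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative layout of the low four bits of a WAL record's info byte.
 * The OLD_/NEW_ names are made up for this sketch; they are not the
 * identifiers used in the PostgreSQL headers.
 */

/* Current scheme: three backup-block bits plus the removable flag. */
#define OLD_MAX_BKP_BLOCKS   3
#define OLD_BKP_BLOCK_MASK   0x0E    /* bits for blocks 0..2 */
#define OLD_BKP_REMOVABLE    0x01    /* the bit under discussion */

/* Proposed scheme: all four bits describe backup blocks. */
#define NEW_MAX_BKP_BLOCKS   4
#define NEW_BKP_BLOCK_MASK   0x0F    /* bits for blocks 0..3 */

/* Bit meaning "backup block number iblk is attached" (block 0 is 0x08). */
static uint8_t
bkp_block_bit(int iblk)
{
    assert(iblk >= 0 && iblk < NEW_MAX_BKP_BLOCKS);
    return (uint8_t) (0x08 >> iblk);
}

/* Count the backup blocks flagged in an info byte under a given mask. */
static int
count_bkp_blocks(uint8_t info, uint8_t mask)
{
    int n = 0;

    for (uint8_t bits = info & mask; bits != 0; bits >>= 1)
        n += bits & 1;
    return n;
}
```

Under these assumptions the bit for a fourth block is the same bit the
removable flag occupies, which is why the two cannot coexist in the info
byte.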

As the above commit message points out, compression of this sort could
only be done externally if that external program knew the complete
details of every single type of WAL record, so that it could figure out
whether any data needed to be extracted from the full-page images and
reinserted in the WAL record. I'm not convinced that anybody has
written such a thing or will be able to maintain it into the future,
as we feel free to whack around the contents of WAL on a regular basis.
The commit message claims that such a program would be posted and
maintained on pgfoundry, but I couldn't find any trace of it (not that
pgfoundry's search tools are very good, but none of "compress", "xlog",
or "wal" produces any hits on such a thing). In any case I think the
modern theory about this is you should get a filesystem that prevents
torn-page writes, and then you can just turn off full_page_writes.

Furthermore, what the XLR_BKP_REMOVABLE bit actually reports is just
whether a backup operation is in progress, and I think we have now (or
easily could) add reporting records to the WAL stream that tell when a
backup starts or stops. So external compression would still be possible
if it kept a bit more state around.
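
As a rough sketch of what "a bit more state" might look like in an external
filter: the record kinds below are hypothetical stand-ins for backup
start/stop markers, and the struct and function names are invented for
illustration. Nothing here reflects an actual WAL record format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record kinds; these are not real WAL record types. */
typedef enum { REC_ORDINARY, REC_BACKUP_START, REC_BACKUP_STOP } RecKind;

typedef struct {
    RecKind kind;
    bool    has_full_page_image;
} WalRec;

/*
 * Walk records in WAL order, remembering whether a backup is in progress.
 * Returns how many records carry a full-page image that may be dropped,
 * i.e. images logged while no backup was running.
 */
static size_t
count_removable_images(const WalRec *recs, size_t nrecs,
                       bool initially_in_backup)
{
    bool   in_backup = initially_in_backup;
    size_t removable = 0;

    assert(recs != NULL || nrecs == 0);

    for (size_t i = 0; i < nrecs; i++) {
        switch (recs[i].kind) {
        case REC_BACKUP_START:
            in_backup = true;
            break;
        case REC_BACKUP_STOP:
            in_backup = false;
            break;
        case REC_ORDINARY:
            if (recs[i].has_full_page_image && !in_backup)
                removable++;
            break;
        }
    }
    return removable;
}
```

A filter that starts reading mid-stream would presumably pass true for
initially_in_backup, since wrongly keeping an image only costs space while
wrongly dropping one is unsafe.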

So: is there actually any such compression program out there?
Would anybody really cry if this flag went away?

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 15:36:49
Message-ID: CA+U5nMKsjrkC1ezSh5sva-onQmqJ4bQ0sUAU3PrYQbCq-Coiug@mail.gmail.com

On Mon, Dec 12, 2011 at 3:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Furthermore, what the XLR_BKP_REMOVABLE bit actually reports is just
> whether a backup operation is in progress, and I think we have now (or
> easily could) add reporting records to the WAL stream that tell when a
> backup starts or stops.  So external compression would still be possible
> if it kept a bit more state around.
>
> So: is there actually any such compression program out there?
> Would anybody really cry if this flag went away?

Yes, WAL records could be invented to mark the boundaries, so yes,
IMHO it is OK to make that flag go away.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 15:42:59
Message-ID: 19108.1323704579@sss.pgh.pa.us

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Mon, Dec 12, 2011 at 3:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Furthermore, what the XLR_BKP_REMOVABLE bit actually reports is just
>> whether a backup operation is in progress, and I think we have now (or
>> easily could) add reporting records to the WAL stream that tell when a
>> backup starts or stops. So external compression would still be possible
>> if it kept a bit more state around.
>>
>> So: is there actually any such compression program out there?
>> Would anybody really cry if this flag went away?

> Yes, WAL records could be invented to mark the boundaries, so yes,
> IMHO it is OK to make that flag go away.

It occurs to me also that we could just move the flag from
per-WAL-record info bytes to per-page or even per-segment WAL headers.
Because we now force a segment switch when starting a backup, the
flag would be seen turned-on soon enough to prevent problems.
Finding out that it's off again after the end of a backup might be
a little delayed, but the only cost is failure to compress a few
compressible records.

I'm not volunteering to do the above, unless someone steps forward
to say that there's active use of this flag, but either one of these
solutions seems more tenable than using up an info-byte bit.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 15:58:52
Message-ID: CA+U5nM+_0AWVO0a_wirh1T86tNvN8bei-rrvdO1egYFiiunv5w@mail.gmail.com

On Mon, Dec 12, 2011 at 3:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On Mon, Dec 12, 2011 at 3:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Furthermore, what the XLR_BKP_REMOVABLE bit actually reports is just
>>> whether a backup operation is in progress, and I think we have now (or
>>> easily could) add reporting records to the WAL stream that tell when a
>>> backup starts or stops.  So external compression would still be possible
>>> if it kept a bit more state around.
>>>
>>> So: is there actually any such compression program out there?
>>> Would anybody really cry if this flag went away?
>
>> Yes, WAL records could be invented to mark the boundaries, so yes,
>> IMHO it is OK to make that flag go away.
>
> It occurs to me also that we could just move the flag from
> per-WAL-record info bytes to per-page or even per-segment WAL headers.
> Because we now force a segment switch when starting a backup, the
> flag would be seen turned-on soon enough to prevent problems.
> Finding out that it's off again after the end of a backup might be
> a little delayed, but the only cost is failure to compress a few
> compressible records.
>
> I'm not volunteering to do the above, unless someone steps forward
> to say that there's active use of this flag, but either one of these
> solutions seems more tenable than using up an info-byte bit.

I'll volunteer. Assume you can reuse the flag and I will patch afterwards.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 16:05:19
Message-ID: 54F4DEBB-60B8-46AD-A2FF-A531BC19B412@krogh.cc

> So: is there actually any such compression program out there?
> Would anybody really cry if this flag went away?

Perhaps http://pglesslog.projects.postgresql.org/

Jesper


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jesper Krogh <jesper(at)krogh(dot)cc>
Cc: Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 17:51:45
Message-ID: 21301.1323712305@sss.pgh.pa.us

Jesper Krogh <jesper(at)krogh(dot)cc> writes:
>> So: is there actually any such compression program out there?

> Perhaps http://pglesslog.projects.postgresql.org/

Hah ... the search facilities on pgfoundry really do leave something to
be desired :-(

So I guess we should try to preserve the functionality. I think the
move-the-flag-to-the-segment-header idea is probably the best.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jesper Krogh <jesper(at)krogh(dot)cc>, Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 19:39:45
Message-ID: CA+U5nMLbcYk0_5rj0umU5hqK5sKsUc=8BK-ccg6H9x4CYiQ-hA@mail.gmail.com

On Mon, Dec 12, 2011 at 5:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jesper Krogh <jesper(at)krogh(dot)cc> writes:
>>> So: is there actually any such compression program out there?
>
>> Perhaps http://pglesslog.projects.postgresql.org/
>
> Hah ... the search facilities on pgfoundry really do leave something to
> be desired :-(
>
> So I guess we should try to preserve the functionality.  I think the
> move-the-flag-to-the-segment-header idea is probably the best.

Yes, that's the best idea.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-12 21:25:05
Message-ID: 11879.1323725105@sss.pgh.pa.us

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Mon, Dec 12, 2011 at 3:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It occurs to me also that we could just move the flag from
>> per-WAL-record info bytes to per-page or even per-segment WAL headers.
>> Because we now force a segment switch when starting a backup, the
>> flag would be seen turned-on soon enough to prevent problems.
>> Finding out that it's off again after the end of a backup might be
>> a little delayed, but the only cost is failure to compress a few
>> compressible records.
>>
>> I'm not volunteering to do the above, unless someone steps forward
>> to say that there's active use of this flag, but either one of these
>> solutions seems more tenable than using up an info-byte bit.

> I'll volunteer. Assume you can reuse the flag and I will patch afterwards.

Thanks for the offer, but after thinking about it a bit more I realized
that this change is quite trivial, so I just went ahead and did it along
with the change in XLR_MAX_BKP_BLOCKS. This seems better since both
related changes are in one commit, and we can't forget to do it.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is anybody actually using XLR_BKP_REMOVABLE?
Date: 2011-12-13 00:27:39
Message-ID: 14874.1323736059@sss.pgh.pa.us

I wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> I'll volunteer. Assume you can reuse the flag and I will patch afterwards.

> Thanks for the offer, but after thinking about it a bit more I realized
> that this change is quite trivial, so I just went ahead and did it along
> with the change in XLR_MAX_BKP_BLOCKS. This seems better since both
> related changes are in one commit, and we can't forget to do it.

BTW, just for the archives' sake: some digging in the git history showed
that my memory was faulty about the pre-2007 limit of XLR_MAX_BKP_BLOCKS
having been 4. It was originally 2, and then in 2003 we increased it to
3 (cf commit 799bc58dc7ed9899facfc8302040749cb0a9af2f). So at the time
the last xl_info bit got taken over for XLR_BKP_REMOVABLE, it had in
fact been unused, and that probably explains why we didn't think harder
about whether there would be a less expensive way to do it.

I still think allowing 4 pages per WAL entry is a good thing, though,
and so am not inclined to withdraw the proposal. But perhaps it would
be worth explaining why this is necessary for SP-GiST. The case where
it comes up is trying to split a list of leaf-page tuples when we need
to add another entry to the list but there's no room on the page.
SP-GiST doesn't allow such lists to cross pages (which I think is a
reasonable restriction, both to avoid excess seeks and because the list
links can thereby be 2 bytes not 6). So what it has to do here is
insert an upper-page tuple ("inner tuple" in the patch's jargon) to
describe the set of leaf page tuples that have now been split into two
or more lists. This requires touching:
1. The leaf page currently holding the list to be modified.
2. Another leaf page that has enough free space for the overrun.
3. An inner page (inner pages and leaf pages are disjoint in SP-GiST)
   where there's enough room to put the new inner tuple.
4. The inner page holding the inner tuple that is the parent of the
   leaf-page list; we have to update its downlink to point to the
   new inner tuple instead of the leaf list.

The code will try to put the new inner tuple on the same page as the
original parent tuple, but if there's no room there, there's no way to
get around the fact that there are four different pages involved here.
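
The page count for the two cases can be sketched as follows. The struct
and field names are invented for illustration and do not come from the
SP-GiST patch; the block numbers used with it are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNo;   /* stand-in for a page number */

/* The four roles listed above; field names are invented for this sketch. */
typedef struct {
    BlockNo src_leaf;      /* 1. leaf page holding the list being split */
    BlockNo dest_leaf;     /* 2. leaf page receiving the overrun */
    BlockNo new_inner;     /* 3. inner page receiving the new inner tuple */
    BlockNo parent_inner;  /* 4. inner page holding the parent tuple */
} SplitPages;

/* Number of distinct pages, hence backup blocks, the WAL record may need. */
static int
distinct_pages(const SplitPages *s)
{
    assert(s != NULL);

    BlockNo all[4] = { s->src_leaf, s->dest_leaf,
                       s->new_inner, s->parent_inner };
    int     n = 0;

    for (int i = 0; i < 4; i++) {
        int seen = 0;

        /* Has this block number already appeared in an earlier role? */
        for (int j = 0; j < i; j++)
            if (all[j] == all[i])
                seen = 1;
        n += !seen;
    }
    return n;
}
```

When the new inner tuple fits on the parent's page, roles 3 and 4 name the
same block and the count is three; when it does not fit, the count is four.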

If somebody held a gun to my head and said "do it with only three",
what I'd try to do is make the update of the parent tuple into a
separately logged WAL action. However, this is not trivial, or at least
it's not trivial to recover during replay if the database crashes after
logging the first action --- there is not enough information on-disk to
figure out what needs to be done. Also, I believe we've been trying
to get rid of that sort of recovery-time cleanup requirement, because
of hot standby. So on the whole I think extending the XLogInsert
machinery to allow 4 backed-up pages is the best solution.

regards, tom lane