Re: Freeze avoidance of very large table.

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Freeze avoidance of very large table.
Date: 2015-04-23 15:39:41
Message-ID: 5539123D.80805@2ndquadrant.com
Lists: pgsql-hackers

On 23/04/15 17:24, Heikki Linnakangas wrote:
> On 04/23/2015 05:52 PM, Jim Nasby wrote:
>> On 4/23/15 2:42 AM, Heikki Linnakangas wrote:
>>> On 04/22/2015 09:24 PM, Robert Haas wrote:
>>>> Yeah. We have a serious need to reduce the size of our on-disk
>>>> format. On a TPC-C-like workload Jan Wieck recently tested, our data
>>>> set was 34% larger than another database at the beginning of the test,
>>>> and 80% larger by the end of the test. And we did twice the disk
>>>> writes. See "The Elephants in the Room.pdf" at
>>>> https://sites.google.com/site/robertmhaas/presentations
>>>
>>> Meh. Adding an 8-byte header to every 8k block would add 0.1% to the
>>> disk size. No doubt it would be nice to reduce our disk footprint, but
>>> the page header is not the elephant in the room.
>>
>> I've often wondered if there was some way we could consolidate XMIN/XMAX
>> from multiple tuples at the page level; that could be a big win for OLAP
>> environments where most of your tuples belong to a pretty small range of
>> XIDs. In many workloads you could have 80%+ of the tuples in a table
>> having a single inserting XID.
>
> It would be doable for xmin - IIRC someone even posted a patch for that
> years ago - but xmax (and ctid) is difficult. When a tuple is inserted,
> Xmax is basically just a reservation for the value that will be put
> there later. You have no idea what that value is, and you can't
> influence it, and when it's time to delete/update the row, you *must*
> have the space for that xmax. So we can't opportunistically use the
> space for anything else, or compress them or anything like that.
>

That depends. If we are going to change the page format anyway, we could
move xmax into a ctid->xmax map in the page header (with no entries for
tuples that have no xmax), or keep a bitmap there marking which tuples
have an xmax. The idea is to stop storing xmax (and potentially other
info) inline in every tuple and instead keep it in the header only for
the tuples that need it. That might have bad performance side effects,
of course, but there are definitely other ways of doing this that we
could explore.
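
To make the idea a bit more concrete, here is a rough toy model in Python
of a page that keeps xmax in a sparse per-page map instead of inline in
each tuple. It is only a sketch, not PostgreSQL code: the Page class, its
method names and the 6-byte map entry (a 2-byte offset number plus a
4-byte xid) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

XID_BYTES = 4                    # a transaction ID is 32 bits wide
OFFSET_BYTES = 2                 # assumed key size: 2-byte offset number
MAP_ENTRY_BYTES = OFFSET_BYTES + XID_BYTES  # assumed size of one map entry


@dataclass
class Page:
    """Toy page: tuples carry no inline xmax; the header holds a sparse map."""
    tuples: List[bytes] = field(default_factory=list)
    xmax_map: Dict[int, int] = field(default_factory=dict)  # offset -> xmax

    def insert(self, payload: bytes) -> int:
        """Store a tuple and return its offset within the page."""
        self.tuples.append(payload)
        return len(self.tuples) - 1

    def delete(self, offset: int, xid: int) -> None:
        """Record the deleting xid in the header map, not in the tuple."""
        if not 0 <= offset < len(self.tuples):
            raise IndexError("no such tuple on this page")
        self.xmax_map[offset] = xid

    def xmax(self, offset: int) -> Optional[int]:
        """Return the tuple's xmax, or None if it was never deleted/updated."""
        return self.xmax_map.get(offset)

    def inline_xmax_bytes(self) -> int:
        """Space used if every tuple reserved an xmax slot inline."""
        return len(self.tuples) * XID_BYTES

    def mapped_xmax_bytes(self) -> int:
        """Space used by the sparse header map under the assumed entry size."""
        return len(self.xmax_map) * MAP_ENTRY_BYTES


# 100 tuples on a page, 10 of them deleted by (hypothetical) xid 1234.
page = Page()
for i in range(100):
    page.insert(b"row %d" % i)
for off in range(10):
    page.delete(off, xid=1234)

print(page.inline_xmax_bytes(), page.mapped_xmax_bytes())
```

With these assumed sizes the map is smaller as long as fewer than about
two thirds of the tuples on the page have an xmax (4 bytes per tuple
versus 6 bytes per entry). The sketch does not address the problem
raised above: delete() still needs free space on the page for the new
map entry, and nothing here guarantees that space exists.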

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
