Re: Performance Improvement by reducing WAL for Update Operation

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-02-12 14:49:18
Message-ID: 20140212144918.GB12551@momjian.us
Lists: pgsql-hackers

On Wed, Feb 12, 2014 at 10:02:32AM +0530, Amit Kapila wrote:
> By issue, I assume you mean to say, which compression algorithm is
> best for this patch.
> For this patch, currently we have two algorithms for which results have been
> posted. As far as I understand Heikki is pretty sure that the latest algorithm
> (compression using prefix-suffix match in old and new tuple) used for this
> patch is better than the other algorithm in terms of CPU gain or overhead.
> The performance data taken by me for the worst case for this algorithm
> shows there is a CPU overhead for this algorithm as well.
>
> OTOH the other algorithm (compression using old tuple as history) can be
> a bigger win in terms of I/O reduction in a larger number of cases.
>
> In short, it is still not decided which algorithm to choose and whether
> it can be enabled by default or it is better to have table level switch
> to enable/disable it.
>
> So I think the decision to be taken here is about below points:
> 1. Are we okay with I/O reduction at the expense of CPU for *worst* cases
> and I/O reduction without impacting CPU (better overall tps) for
> *favourable* cases?
> 2. If we are not okay with worst case behaviour, then can we provide
> a table-level switch, so that it can be decided by user?
> 3. If none of above, then is there any other way to mitigate the worst
> case behaviour or shall we just reject this patch and move on.
>
> Given a choice to me, I would like to go with option-2, because I think
> for most cases UPDATE statement will have same data for old and
> new tuples except for some part of tuple (generally columns having large
> text data are not modified), so we will end up mostly in favourable cases
> and surely for worst cases we don't want user to suffer from CPU overhead,
> so a table-level switch is also required.

I think 99.9% of users are never going to adjust this so we had better
choose something we are happy to enable for effectively everyone. In my
reading, prefix/suffix seemed safe for everyone. We can always revisit
this if we think of something better later, as WAL format changes are not
a problem for pg_upgrade.

I also think that if we make it user-tunable, it will be so hard for users
to know when to adjust it that it is almost not worth the user interface
complexity it adds.

I suggest we go with always-on prefix/suffix mode, then add some check
so the worst case is avoided by just giving up on compression.
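
As a rough illustration of that idea (not code from the patch), a
prefix/suffix match with a give-up check could look like the Python sketch
below. The function names and the min_saving_pct threshold are hypothetical;
the thread does not say what check would be used or what cutoff it would
have.

```python
from typing import Optional, Tuple

Delta = Tuple[int, int, bytes]  # (prefix_len, suffix_len, changed middle bytes)


def prefix_suffix_delta(old: bytes, new: bytes,
                        min_saving_pct: int = 25) -> Optional[Delta]:
    """Encode `new` relative to `old` by common prefix and suffix.

    Returns None (give up, log the tuple uncompressed) when the shared
    bytes fall below min_saving_pct percent of the new tuple. The
    threshold is an assumed value for illustration only.
    """
    max_common = min(len(old), len(new))

    # Count matching bytes from the front; stops at the first mismatch,
    # so completely different tuples cost almost nothing here.
    prefix_len = 0
    while prefix_len < max_common and old[prefix_len] == new[prefix_len]:
        prefix_len += 1

    # Count matching bytes from the back, never overlapping the prefix.
    max_suffix = max_common - prefix_len
    suffix_len = 0
    while (suffix_len < max_suffix
           and old[len(old) - 1 - suffix_len] == new[len(new) - 1 - suffix_len]):
        suffix_len += 1

    # The give-up check: not enough shared data to be worth encoding.
    saved = prefix_len + suffix_len
    if saved * 100 < len(new) * min_saving_pct:
        return None

    middle = new[prefix_len:len(new) - suffix_len]
    return prefix_len, suffix_len, middle


def apply_delta(old: bytes, delta: Delta) -> bytes:
    """Rebuild the new tuple from the old tuple plus the delta."""
    prefix_len, suffix_len, middle = delta
    return old[:prefix_len] + middle + old[len(old) - suffix_len:]
```

Both scans stop at the first mismatching byte, so when the old and new
tuples share nothing the work done before giving up is a couple of
comparisons.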

As I said previously, I think compressing the page images is the next
big win in this area.

> I think here one might argue that for some users it is not feasible to
> decide whether their tuple data for UPDATE is going to be similar
> or completely different and they are not at all ready for any risk for
> CPU overhead, but they would be happy to see I/O reduction in which
> case it is difficult to decide what should be the value of table-level
> switch. Here I think the only answer is "nothing is free" in this world,
> so either make sure about the application's behaviour for UPDATE
> statement before going to production or just don't enable this switch and
> be happy with the current behaviour.

Again, can't we make a minimal attempt at prefix/suffix compression so
there is no measurable overhead?

> On the other side there will be users who will be pretty certain about their
> usage of UPDATE statement or at least are ready to evaluate their
> application if they can get such a huge gain, so it would be quite useful
> feature for such users.
>
> >can we move forward with the full-page compression patch?
>
> In my opinion, it is not certain that whatever compression algorithm got
> decided for this patch (if any) can be directly used for full-page
> compression, some ideas could be used or maybe the algorithm could be
> tweaked a bit to make it usable for full-page compression.

Thanks, I understand that now.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
