Re: Performance Improvement by reducing WAL for Update Operation

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amit(dot)kapila(at)huawei(dot)com
Cc: hlinnakangas(at)vmware(dot)com, noah(at)leadboat(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2012-12-10 09:11:05
Message-ID: 20121210.181105.256337385.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you.

> >heap_attr_get_length_and_check_equals:
..
> >- This function returns always false for attrnum <= 0 as whole
> > tuple or some system attrs comparison regardless of the real
> > result, which is a bit different from the anticipation which
> > the name gives. If you need to keep this optimization, it
> > should have the name more specific to the purpose.
>
> The heap_attr_get_length_and_check_equals function is similar to heap_tuple_attr_equals,
> the attrnum <= 0 check is required for heap_tuple_attr_equals.

Sorry, you're right.

> >haap_delta_encode:
> >
> >- Some misleading variable names (like match_not_found),
> > some reatitions of similiar codelets (att_align_pointer, pglz_out_tag),
> > misleading slight difference of the meanings of variables of
> > similar names(old_off and new_off and the similar pairs),
> > and bit tricky use of pglz_out_add and pglz_out_tag with length = 0.
> >
> > These are welcome to be modified for better readability.
>
> The variable names are modified, please check them once.
>
> The (att_align_pointer, pglz_out_tag) repetition code is added to take care of padding only incase of values are equal.
> Use of pglz_out_add and pglz_out_tag with length = 0 is done because of code readability.

Oops! Sorry for mistake. My point was that the bases for old_off
(of match_off) and dp, not new_off. It is no unnatural. Namings
had not been the problem and the function was perfect as of the
last patch. I'd been confised by the asymmetry between match_off
to pglz_out_tag and dp to pglz_out_add.

> Another change is also done to handle the history size of 2 bytes which is possible with the usage of LZ macro's for delta encoding.

Good catch. This seems to have been a potential bug which does no
harm when called from pglz_compress..

==========

Looking into wal_update_changes_mod_lz_v6.patch, I understand
that this patch experimentally adds literal data segment which
have more than single byte in PG-LZ algorithm. According to
pglz_find_match, memCMP is slower than 'while(*s && *s == *d)' if
len < 16 and I suppose it is probably true at least for 4 byte
length data. This is also applied on encoding side. If this mod
does no harm to performance, I want to see this applied also to
pglz_comress.

By the way, the comment on pg_lzcompress.c:690 seems to quite
differ from what the code does.

regards,

*1: http://archives.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C38285495B0@szxeml509-mbx

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vik Reykja 2012-12-10 09:23:57 Re: proposal: fix corner use case of variadic fuctions usage
Previous Message Amit Kapila 2012-12-10 08:26:23 Re: [BUG?] lag of minRecoveryPont in archive recovery