Re: [REVIEW] Re: Compression of full-page-writes

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-18 10:31:50
Message-ID: CAH2L28tvW6VEB9tfQfWHoDqF21TsczSA7R2gX9U=0wk3k+9dQA@mail.gmail.com
Lists: pgsql-hackers

>Isn't it better to allocate the memory for compression_scratch in
>InitXLogInsert()
>like hdr_scratch?

I think compression_scratch was made a statically allocated global variable
as a result of the following earlier discussion:

http://www.postgresql.org/message-id/CA+TgmoazNBuwnLS4bpwyqgqteEznOAvy7KWdBm0A2-tBARn_aQ@mail.gmail.com
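
For readers comparing the two options, here is a minimal sketch of a statically allocated scratch buffer next to one allocated once at initialization time, the way hdr_scratch is handled. It does not reproduce the reasoning in the linked message. All names below (SKETCH_BLCKSZ, sketch_init_insert, and so on) are hypothetical, the 8192-byte size is an assumption, and plain malloc stands in for PostgreSQL's own memory-context allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192      /* assumption: 8 kB pages */

/* Option A: static storage, reserved for the lifetime of the process. */
static char static_scratch[SKETCH_BLCKSZ];

/* Option B: pointer filled in once by an init routine. */
static char *dynamic_scratch = NULL;

/* Hypothetical stand-in for an InitXLogInsert()-style one-time setup. */
void
sketch_init_insert(void)
{
    if (dynamic_scratch == NULL)
    {
        dynamic_scratch = malloc(SKETCH_BLCKSZ);
        assert(dynamic_scratch != NULL);
        memset(dynamic_scratch, 0, SKETCH_BLCKSZ);
    }
}

/* Option A needs no setup call before use. */
char *
sketch_static_scratch(void)
{
    return static_scratch;
}

/* Option B must make sure initialization has happened first. */
char *
sketch_dynamic_scratch(void)
{
    sketch_init_insert();
    return dynamic_scratch;
}
```

Either way the buffer is set up once and reused for every block image; the difference is only whether the space is reserved in the binary's static data or obtained at startup.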

Thank you,
Rahila Syed

On Thu, Dec 18, 2014 at 1:57 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>
> On Thu, Dec 18, 2014 at 2:21 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
> >
> >
> > On Wed, Dec 17, 2014 at 11:33 PM, Rahila Syed <rahilasyed90(at)gmail(dot)com>
> > wrote:
> >>
> >> I had a look at the code. I have a few minor points,
> >
> > Thanks!
> >
> >> + bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
> >> +
> >> + if (is_compressed)
> >> {
> >> - rdt_datas_last->data = page;
> >> - rdt_datas_last->len = BLCKSZ;
> >> + /* compressed block information */
> >> + bimg.length = compress_len;
> >> + bimg.extra_data = hole_offset;
> >> + bimg.extra_data |= XLR_BLCK_COMPRESSED_MASK;
> >>
> >> For consistency with the existing code, how about renaming the macro
> >> XLR_BLCK_COMPRESSED_MASK to BKPBLOCK_HAS_COMPRESSED_IMAGE, along the
> >> lines of BKPBLOCK_HAS_IMAGE?
> >
> > OK, why not...
> >
> >>
> >> + blk->hole_offset = extra_data & ~XLR_BLCK_COMPRESSED_MASK;
> >> Here, I think that naming the mask BKPBLOCK_HOLE_OFFSET_MASK would make
> >> it clearer that the lower 15 bits of the extra_data field hold the
> >> hole_offset value. This suggestion is also just for consistency with
> >> the existing BKPBLOCK_FORK_MASK for the fork_flags field.
> >
> > Yeah that seems clearer, let's define it as ~XLR_BLCK_COMPRESSED_MASK
> > though.
> >
> >> And comment typo
> >> + * First try to compress block, filling in the page hole with zeros
> >> + * to improve the compression of the whole. If the block is considered
> >> + * as incompressible, complete the block header information as if
> >> + * nothing happened.
> >>
> >> As the hole is no longer being compressed, this needs to be changed.
> >
> > Fixed. As well as an additional comment block down.
> >
> > A couple of things noticed on the fly:
> > - Fixed pg_xlogdump, which did not report the FPW information completely
> > correctly
> > - Fixed a couple of typos and malformed sentences
> > - Added an assertion to check that the hole offset value does not use the bit
> > used for compression status
> > - Reworked docs, mentioning as well that wal_compression is off by default.
> > - Removed stuff in pg_controldata and XLOG_PARAMETER_CHANGE (mentioned by
> > Fujii-san)
>
> Thanks!
>
> + else
> + memcpy(compression_scratch, page, page_len);
>
> I don't think the block image needs to be copied to the scratch buffer here.
> We can try to compress the "page" directly.
>
> +#include "utils/pg_lzcompress.h"
> #include "utils/memutils.h"
>
> pg_lzcompress.h should be after memutils.h.
>
> +/* Scratch buffer used to store block image to-be-compressed */
> +static char compression_scratch[PGLZ_MAX_BLCKSZ];
>
> Isn't it better to allocate the memory for compression_scratch in
> InitXLogInsert()
> like hdr_scratch?
>
> + uncompressed_page = (char *) palloc(PGLZ_RAW_SIZE(header));
>
> Why don't we allocate the buffer for uncompressed page only once and
> keep reusing it like XLogReaderState->readBuf? The size of uncompressed
> page is at most BLCKSZ, so we can allocate the memory for it even before
> knowing the real size of each block image.
>
> - printf(" (FPW); hole: offset: %u, length: %u\n",
> - record->blocks[block_id].hole_offset,
> - record->blocks[block_id].hole_length);
> + if (record->blocks[block_id].is_compressed)
> + printf(" (FPW); hole offset: %u, compressed length %u\n",
> + record->blocks[block_id].hole_offset,
> + record->blocks[block_id].bkp_len);
> + else
> + printf(" (FPW); hole offset: %u, length: %u\n",
> + record->blocks[block_id].hole_offset,
> + record->blocks[block_id].bkp_len);
>
> We need to consider what info about FPW we want pg_xlogdump to report.
> I'd like to be able to calculate, from pg_xlogdump's report, how many bytes
> compression saved for each FPW. So I'd like the report to show both the
> uncompressed and the compressed length of the FPW.
>
> In pg_config.h, doesn't the comment on BLCKSZ need to be updated? The
> maximum size of BLCKSZ can be affected not only by itemid but also by
> XLogRecordBlockImageHeader.
>
> bool has_image;
> + bool is_compressed;
>
> Doesn't ResetDecoder need to reset is_compressed?
>
> +#wal_compression = off # enable compression of full-page writes
>
> Currently wal_compression compresses only FPW, so isn't it better to place
> it after full_page_writes in postgresql.conf.sample?
>
> + uint16 extra_data; /* used to store offset of bytes in "hole", with
> + * last free bit used to check if block is
> + * compressed */
>
> At least to me, defining something like the following seems easier to read.
>
> uint16 hole_offset:15,
> is_compressed:1
>
> Regards,
>
> --
> Fujii Masao
>
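
To make the two layouts discussed above concrete, here is a small sketch of storing a 15-bit hole offset and a one-bit compression flag in a single 16-bit field, first with explicit masks and then with the bit-field form suggested at the end of the quoted mail. The SKETCH_* names and helper functions are hypothetical, and placing the flag in the top bit (0x8000) is an assumption drawn from the remark that the lower 15 bits hold hole_offset; the patch's actual macro values are not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: top bit is the flag, lower 15 bits are the offset. */
#define SKETCH_COMPRESSED_MASK   0x8000
#define SKETCH_HOLE_OFFSET_MASK  0x7FFF  /* 16-bit complement of the flag */

/* Mask style: combine offset and flag into one uint16_t. */
uint16_t
sketch_pack(uint16_t hole_offset, bool is_compressed)
{
    uint16_t extra_data = hole_offset;

    /* The offset must not collide with the bit reserved for the flag. */
    assert((hole_offset & SKETCH_COMPRESSED_MASK) == 0);

    if (is_compressed)
        extra_data |= SKETCH_COMPRESSED_MASK;
    return extra_data;
}

uint16_t
sketch_hole_offset(uint16_t extra_data)
{
    return extra_data & SKETCH_HOLE_OFFSET_MASK;
}

bool
sketch_is_compressed(uint16_t extra_data)
{
    return (extra_data & SKETCH_COMPRESSED_MASK) != 0;
}

/*
 * Bit-field style. A 16-bit base type for bit-fields is
 * implementation-defined in standard C; GCC and Clang accept it.
 */
typedef struct SketchBlockImageBits
{
    uint16_t hole_offset:15,
             is_compressed:1;
} SketchBlockImageBits;
```

With the mask style the on-disk bit positions are spelled out in the macros, while the bit-field style reads more directly but leaves the bit ordering inside the field to the compiler.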
