Re: logical changeset generation v3

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: logical changeset generation v3
Date: 2012-11-21 06:28:30
Message-ID: CAB7nPqSAUkU4GQta_R_ccuyNr8Q=KStcNqx4Y9J0dM=uQD7fEg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 20, 2012 at 8:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2012-11-20 09:30:40 +0900, Michael Paquier wrote:
> > On Mon, Nov 19, 2012 at 5:50 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > > On 2012-11-19 16:28:55 +0900, Michael Paquier wrote:
> > > > I am just looking at this patch and will provide some comments.
> > > > By the way, you forgot the installation part of pg_receivellog, please
> > > > see patch attached.
> > >
> > > That actually was somewhat intended, I thought people wouldn't like the
> > > name and I didn't want a binary that's going to be replaced anyway
> > > lying around ;)
> > >
> > OK no problem. For sure this is going to happen, I was wondering myself
> > if it could be possible to merge pg_receivexlog and pg_receivellog into a
> > single utility with multiple modes :)
>
> Don't really see that, the differences already are significant and imo
> are bound to get bigger. Shouldn't live in pg_basebackup/ either...
>
I am sure that this will be the object of many future discussions.

> > Btw, here are some extra comments based on my progress, hope it will be
> > useful for other people playing around with your patches.
> > 1) The contrib module test_decoding must be installed on the server side
> > or the test case will not work.
> > 2) The following logs appear on the server:
> > LOG: forced to assume catalog changes for xid 1370 because it was running to early
> > WARNING: ABORT 1370
> > Actually I saw that there are many warnings like this.
>
> Those aren't unexpected. Perhaps I should not make it a warning then...
>
A NOTICE would be more appropriate: a WARNING means that something that may
endanger the system has happened, but as far as I understand from your
explanation this is not the case.

> A short explanation:
>
> We can only decode tuples we see in the WAL when we already have a
> timetravel catalog snapshot before that transaction started. To build
> such a snapshot we need to collect information about committed transactions
> which changed the catalog. Unfortunately we can't diagnose whether a txn
> changed the catalog without a snapshot so we just assume all committed
> ones do - it just costs a bit of memory. That's the background of the
> "forced to assume catalog changes for ..." message.
>
OK, so this snapshot only needs to include the XIDs of transactions that
have modified the catalogs. Do I get it right? This way you are able to
fetch the correct relation definition for replication decoding.

Just thinking out loud, but it looks like a waste to store the XIDs of all
committed transactions. On the other hand, there is no way in current core
code to track the XIDs of transactions that modified a catalog. So yes, this
approach is fine for now: refining the XID tracking for snapshot
reconstruction is something that could be improved later. Those are only
thoughts though...
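
To make the trade-off concrete, here is a toy Python sketch. It is not code
from the patch: the class, its fields and the workload are all hypothetical,
and it only compares how many committed XIDs have to be remembered when every
commit is assumed to have changed the catalog versus when only the commits
that really did so are tracked.

```python
from dataclasses import dataclass, field


@dataclass
class CommittedXidTracker:
    """Toy model (hypothetical): committed XIDs a decoding snapshot remembers."""

    assume_all_modify_catalog: bool = True
    tracked: set = field(default_factory=set)

    def on_commit(self, xid: int, modified_catalog: bool) -> None:
        # In the conservative mode we cannot tell whether the transaction
        # touched the catalog, so every committed XID is remembered.
        if self.assume_all_modify_catalog or modified_catalog:
            self.tracked.add(xid)


# Made-up OLTP-like workload: 1000 commits, every 100th one changes the catalog.
workload = [(xid, xid % 100 == 0) for xid in range(1000, 2000)]

conservative = CommittedXidTracker(assume_all_modify_catalog=True)
refined = CommittedXidTracker(assume_all_modify_catalog=False)
for xid, modified in workload:
    conservative.on_commit(xid, modified)
    refined.on_commit(xid, modified)

print(len(conservative.tracked), len(refined.tracked))
```

The refined tracker is only possible if something tells us which transactions
modified a catalog, which is exactly the information that is missing today.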

> The reason for the ABORTs is related but different. We start out in the
> "SNAPBUILD_START" state when we try to build a snapshot. When we find
> initial information about running transactions (i.e. xl_running_xacts)
> we switch to the "SNAPBUILD_FULL_SNAPSHOT" state which means we can
> decode all changes in transactions that start *after* the current
> lsn. Earlier transactions might have tuples on a catalog state we can't
> query.
>
Just to be clear, lsn means the log sequence number associated with each xlog
record?

> Only when all transactions we observed as running before the
> FULL_SNAPSHOT state have finished we switch to SNAPBUILD_CONSISTENT.
> As we want a consistent/reproducible set of transactions to produce
> output via the logstream we only pass transactions to the output plugin
> if they commit *after* CONSISTENT (they can start earlier though!). This
> allows us to produce a pg_dump compatible snapshot in the moment we get
> consistent that contains exactly the changes we won't stream out.
>
> Makes sense?
>
OK, got it. Thanks for your explanation.
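
For anyone else following along, the state transitions described above can be
sketched as a small Python model. This is only my reading of the explanation,
not code from the patch: the class and method names are made up, LSNs are not
modelled at all, and the only things taken from the text are the three state
names and the rule that a transaction is handed to the output plugin only if
it commits after the CONSISTENT state has been reached.

```python
from enum import Enum


class SnapState(Enum):
    START = "SNAPBUILD_START"
    FULL_SNAPSHOT = "SNAPBUILD_FULL_SNAPSHOT"
    CONSISTENT = "SNAPBUILD_CONSISTENT"


class SnapshotBuilder:
    """Hypothetical toy model of the snapshot builder states."""

    def __init__(self):
        self.state = SnapState.START
        # XIDs observed as running when FULL_SNAPSHOT was reached.
        self.initial_running = set()

    def on_running_xacts(self, running_xids):
        # First information about running transactions: START -> FULL_SNAPSHOT.
        if self.state is SnapState.START:
            self.initial_running = set(running_xids)
            self.state = SnapState.FULL_SNAPSHOT
            self._maybe_consistent()

    def _maybe_consistent(self):
        # All initially running transactions finished: -> CONSISTENT.
        if self.state is SnapState.FULL_SNAPSHOT and not self.initial_running:
            self.state = SnapState.CONSISTENT

    def on_finish(self, xid, committed):
        """Return True if the transaction would go to the output plugin."""
        was_consistent = self.state is SnapState.CONSISTENT
        self.initial_running.discard(xid)
        self._maybe_consistent()
        # Only commits that happen after CONSISTENT are streamed out.
        return committed and was_consistent


builder = SnapshotBuilder()
builder.on_running_xacts({10, 11})
results = [
    builder.on_finish(12, True),   # started later, commits before CONSISTENT
    builder.on_finish(10, True),   # initially running, commits
    builder.on_finish(11, False),  # initially running, aborts -> CONSISTENT
    builder.on_finish(13, True),   # commits after CONSISTENT
    builder.on_finish(14, False),  # aborts after CONSISTENT
]
print(builder.state.value, results)
```

In this model only the commit of transaction 13 is streamed; everything that
finished earlier is what the exported snapshot would have to cover.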

So, coming back to it once again: the snapshot we build only needs the XIDs of
transactions that modified the catalogs in order to get a consistent view of
relation info for decoding.
I think that refining the XID tracking to minimize the size of the snapshot
built for decoding would be key for performance, especially for OLTP-type
applications (lots of transactions involved, few of them touching catalogs).
--
Michael Paquier
http://michael.otacoo.com
