Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Farina <daniel(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
Date: 2012-06-19 14:45:42
Message-ID: 201206191645.42901.andres@2ndquadrant.com
Lists: pgsql-hackers

On Tuesday, June 19, 2012 04:30:59 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Tuesday, June 19, 2012 04:17:01 PM Tom Lane wrote:
> >> ... (If you are thinking
> >> of something sufficiently high-level that merging could possibly work,
> >> then it's not WAL, and we shouldn't be trying to make the WAL
> >> representation cater for it.)
> >
> > The idea is that if youre replaying changes on node A originating from
> > node B you set the origin to *B* in the wal records that are generated
> > during that. So when B, in a bidirectional setup, replays the changes
> > that A has made it can simply ignore all changes which originated on
> > itself.
>
> This is most certainly not possible at the level of WAL.
Huh? This isn't used during normal crash-recovery replay. The information is
used when decoding the WAL into logical changes and applying those. It's just a
common piece of information that's needed for a large number of records.
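
The origin check described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: NodeId, LogicalChange,
should_apply and filter_changes are hypothetical names, not code from the
patch, and the "payload" field merely stands in for a decoded change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node identifier; name and width are assumptions. */
typedef uint16_t NodeId;

/* Hypothetical decoded change, tagged with the node it first happened on. */
typedef struct LogicalChange
{
    NodeId origin_id;   /* node where the change originated */
    int    payload;     /* stand-in for the decoded change itself */
} LogicalChange;

/* A change is applied only if it did not originate on the local node. */
static bool
should_apply(const LogicalChange *change, NodeId local_node)
{
    return change->origin_id != local_node;
}

/*
 * Copy the changes that should be applied on local_node into out,
 * preserving their order, and return how many were kept.
 */
static size_t
filter_changes(const LogicalChange *in, size_t n, NodeId local_node,
               LogicalChange *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (should_apply(&in[i], local_node))
            out[kept++] = in[i];
    }
    return kept;
}
```

In a two-node setup, node B would call filter_changes with its own id on the
changes decoded from A's WAL, so that anything A merely replayed from B is
dropped instead of being applied a second time.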

Alternatively it could be added to all the records that need it, but that
would smear the necessary logic - which is currently trivial - over more of
the backend. And it would increase the actual size of the WAL, which this patch
does not.

> As I said above, we shouldn't be trying to shoehorn high level logical-
> replication commands into WAL streams. No good can come of confusing those
> concepts.
It's not doing anything high-level in there. All that patch does is embed
a single piece of information in previously unused space.
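
To illustrate what "previously unused space" can mean for a record header,
here is a small sketch. The two structs are hypothetical and deliberately
simplified; they are not the actual xlog record layout. The point is only
that, on common ABIs, a 16-bit field placed where the compiler would
otherwise insert alignment padding leaves the struct size unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: two bytes of padding follow rmid on common ABIs. */
typedef struct HeaderBefore
{
    uint32_t total_len;
    uint8_t  info;
    uint8_t  rmid;
    /* compiler-inserted padding so that prev is 4-byte aligned */
    uint32_t prev;
} HeaderBefore;

/* Same header with a 16-bit origin id occupying the former padding. */
typedef struct HeaderAfter
{
    uint32_t total_len;
    uint8_t  info;
    uint8_t  rmid;
    uint16_t origin_id;   /* assumed name and width, for illustration */
    uint32_t prev;
} HeaderAfter;

/* True if adding origin_id did not grow the header on this platform. */
static bool
header_size_unchanged(void)
{
    return sizeof(HeaderBefore) == sizeof(HeaderAfter);
}
```

Whether the sizes really match depends on the platform's alignment rules, so
a check like header_size_unchanged() is worth keeping next to such a change.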

I can follow the argument that you do not want *any* logical information in
the WAL. But as I said in the email introducing the patchset: I don't really see
an alternative. Otherwise we would just duplicate all the locking/scalability
issues of xlog as well as the amount of writes.
Besides logging some more information in particular records
(HEAP_(UPDATE|DELETE)) when wal_level = logical, and ensuring FPWs don't remove
the record data in HEAP_(INSERT|UPDATE|DELETE) in patch 07/16, this is the only
change that I really foresee being needed for doing the logical stuff.

Do you really see this as such a big problem?

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
