Re: Changeset Extraction v7.9.1

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changeset Extraction v7.9.1
Date: 2014-03-17 12:00:22
Message-ID: CA+TgmobbGx2sVG6_Mm8z4q-6moY0Mk-hCdVY7rtT6O7K7Zz5hQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 17, 2014 at 7:27 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> - There doesn't seem to be any provision for this tool to ever switch
>> from one output file to the next. That seems like a practical need.
>> One idea would be to have it respond to SIGHUP by reopening the
>> originally-named output file. Another would be to switch, after so
>> many bytes, to filename.1, then filename.2, etc.
>
> Hm. So far I haven't had the need, but you're right, it would be
> useful. I don't like the .<n> notion, but SIGHUP would be fine with
> me. I'll add that.

Cool.
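
For illustration, the SIGHUP variant could look roughly like the sketch
below. It is written in Python for brevity and is not the tool's actual
code; `ReopenableOutput` and `install_sighup_handler` are hypothetical
names. The handler only sets a flag, and the file is reopened under its
original name before the next record is written, so an external log
rotator can rename the old file and then send SIGHUP.

```python
import os
import signal


class ReopenableOutput:
    """Append-only output file that can be reopened by name on request."""

    def __init__(self, path):
        self.path = path
        self.reopen_requested = False
        self._fh = open(path, "ab")

    def request_reopen(self, signum=None, frame=None):
        # Usable as a signal handler: only set a flag here and do the
        # real work in the main loop, between records.
        self.reopen_requested = True

    def write(self, record):
        if self.reopen_requested:
            self._reopen()
        self._fh.write(record)

    def _reopen(self):
        # Push everything to the old file (which may have been renamed
        # by now), then open the originally-named path again.
        self._fh.flush()
        os.fsync(self._fh.fileno())
        self._fh.close()
        self._fh = open(self.path, "ab")
        self.reopen_requested = False

    def close(self):
        self._fh.flush()
        self._fh.close()


def install_sighup_handler(out):
    # SIGHUP is not available on every platform (e.g. Windows).
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, out.request_reopen)
```

Rotation would then be `mv out.log out.log.1` followed by `kill -HUP`
on the receiving process; records written after the signal land in a
fresh `out.log`.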

>> - It confirms the write and flush positions, but doesn't appear to
>> actually flush anywhere.
>
> Yea. The reason it reports the flush position is that it allows testing
> sync rep. I don't think other use cases will appreciate frequent
> fsyncs... Maybe make it optional?

Well, as I'm sure you recognize, if you're actually trying to build a
replication solution with this tool, you can't let the database throw
away the state required to suck changes out of the database unless
you've got those changes safely stored away somewhere else. Now, of
course, if you don't acknowledge to the database that the stuff is on
disk, you're going to get data file bloat and excess WAL retention,
unlucky you. But acknowledging that you've got the changes when
they're not actually on disk doesn't provide the guarantees
you went to so much trouble to build into the mechanism. So the
no-flush version really can ONLY ever be useful for testing, AFAICS,
or if you really don't care that much whether it can survive a server
crash.

Perhaps there could be a switch for an fsync interval, or something
like that. The default could be, say, to fsync every 10 seconds. And
if you want to change it, then go ahead; 0 disables. Writing to
standard output would be documented as unreliable. Other ideas
welcome.
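
To make that concrete, here is a rough sketch of the bookkeeping,
again in Python and not taken from the patch. `FlushTracker`, its
`fsync_interval` parameter and the injectable `clock` are hypothetical
names. The assumption baked in is that the flush position reported
back to the server only advances after a successful fsync, so with an
interval of 0 it never advances at all.

```python
import os
import time


class FlushTracker:
    """Track write and flush positions for an output file descriptor."""

    def __init__(self, fd, fsync_interval=10.0, clock=time.monotonic):
        self.fd = fd
        self.fsync_interval = fsync_interval  # seconds; 0 disables fsync
        self.clock = clock
        self.write_pos = 0  # last position handed to the kernel
        self.flush_pos = 0  # last position known to be on disk
        self._last_fsync = clock()

    def write(self, data, end_pos):
        # os.write may write fewer bytes than asked; loop until done.
        view = memoryview(data)
        while len(view) > 0:
            n = os.write(self.fd, view)
            view = view[n:]
        self.write_pos = end_pos
        self.maybe_fsync()

    def maybe_fsync(self):
        if self.fsync_interval <= 0:
            return False  # fsync disabled: never claim durability
        if self.write_pos == self.flush_pos:
            return False  # nothing new to sync
        if self.clock() - self._last_fsync < self.fsync_interval:
            return False  # interval has not elapsed yet
        os.fsync(self.fd)
        self.flush_pos = self.write_pos
        self._last_fsync = self.clock()
        return True

    def feedback(self):
        # (write position, flush position) to report to the server.
        return self.write_pos, self.flush_pos
```

A real receive loop would also need to call `maybe_fsync()` when it is
idle, so that the last records before a quiet period still get synced
and acknowledged.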

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
