Re: Switching timeline over streaming replication

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Switching timeline over streaming replication
Date: 2012-12-20 12:50:10
Message-ID: 20121220125010.GC4303@awork2.anarazel.de
Lists: pgsql-hackers

On 2012-12-20 14:45:05 +0200, Heikki Linnakangas wrote:
> On 17.12.2012 15:05, Thom Brown wrote:
> >I just set up 120 chained standbys, and for some reason I'm seeing these
> >errors:
> >
> >LOG: replication terminated by primary server
> >DETAIL: End of WAL reached on timeline 1
> >LOG: record with zero length at 0/301EC10
> >LOG: fetching timeline history file for timeline 2 from primary server
> >LOG: restarted WAL streaming at 0/3000000 on timeline 1
> >LOG: replication terminated by primary server
> >DETAIL: End of WAL reached on timeline 1
> >LOG: new target timeline is 2
> >LOG: restarted WAL streaming at 0/3000000 on timeline 2
> >LOG: replication terminated by primary server
> >DETAIL: End of WAL reached on timeline 2
> >FATAL: error reading result of streaming command: ERROR: requested WAL
> >segment 000000020000000000000003 has already been removed
> >
> >ERROR: requested WAL segment 000000020000000000000003 has already been
> >removed
> >LOG: started streaming WAL from primary at 0/3000000 on timeline 2
> >ERROR: requested WAL segment 000000020000000000000003 has already been
> >removed
>
> I just committed a patch that should make the "requested WAL segment
> 000000020000000000000003 has already been removed" errors go away. The trick
> was for walsenders to not switch to the new timeline until at least one
> record has been replayed on it. That closes the window where the walsender
> already considers the new timeline to be the latest, but the WAL file has
> not been created yet.

I vote for introducing InvalidTimeLineID soon... 0 as an invalid
TimeLineID seems to be spreading and is annoying to grep for.
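
Roughly what that could look like, as a minimal sketch rather than a
patch: the typedef is assumed to mirror the existing uint32 TimeLineID,
while the macro name and the TimeLineIDIsValid() helper are hypothetical
and do not exist in the tree today.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for PostgreSQL's "typedef uint32 TimeLineID". */
typedef uint32_t TimeLineID;

/*
 * Hypothetical symbol: a named constant for "no timeline". Real
 * timelines are numbered from 1, so 0 is free to act as the sentinel.
 */
#define InvalidTimeLineID ((TimeLineID) 0)

/* Hypothetical helper so callers stop comparing against a bare 0. */
static inline bool
TimeLineIDIsValid(TimeLineID tli)
{
    return tli != InvalidTimeLineID;
}
```

With a named constant, "grep InvalidTimeLineID" finds every place that
relies on the sentinel, which a search for a literal 0 cannot do.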

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
