Re: Fast promotion failure

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: amit(dot)kapila(at)huawei(dot)com
Cc: masao(dot)fujii(at)gmail(dot)com, hlinnakangas(at)vmware(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fast promotion failure
Date: 2013-05-09 08:44:08
Message-ID: 20130509.174408.198415395.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

After adding some extra log output, the situation should be
clearer.

It seems that sby-B fails to promote to TLI=2; nevertheless, the
history file for TLI=2 is somehow sent to sby-C. So sby-B
remains on TLI=1 while sby-C alone switches to TLI=2.

# Come to think of it, I suspect the additional logs are not so useful :(

> B 2013-05-09 17:29:53.380 JST 32258 ERROR: server switched off timeline 1 at 0/53F8B60, but walsender already streamed up to 0/53FA000
> C 2013-05-09 17:29:53.380 JST 32257 FATAL: could not receive data from WAL stream: ERROR: server switched off timeline 1 at 0/53F8B60, but walsender already streamed up to 0/53FA000
>
> B 2013-05-09 17:29:53.380 JST 32244 LOG: database system is ready to accept connections
..
> C 2013-05-09 17:30:08.395 JST 32256 LOG: Reading page on Timeline ID = 1
> C 2013-05-09 17:30:08.398 JST 32274 LOG: fetching timeline history file for timeline 2 from primary server
> C 2013-05-09 17:30:08.448 JST 32274 LOG: started streaming WAL from primary at 0/5000000 on timeline 1
> C 2013-05-09 17:30:08.452 JST 32274 LOG: replication terminated by primary server
> C 2013-05-09 17:30:08.452 JST 32274 DETAIL: End of WAL reached on timeline 1 at 0/53F8B60
> C 2013-05-09 17:30:08.452 JST 32256 LOG: new target timeline is 2
> C 2013-05-09 17:30:08.452 JST 32256 LOG: Reading page on Timeline ID = 1
> C 2013-05-09 17:30:08.452 JST 32256 LOG: Reading page on Timeline ID = 1
> C 2013-05-09 17:30:08.453 JST 32274 LOG: restarted WAL streaming at 0/5000000 on timeline 2
> B 2013-05-09 17:30:10.913 JST 32248 LOG: This checkpoint record is on TimelineID = 1, loc is about 0/53F8C30
> B 2013-05-09 17:30:10.953 JST 32248 LOG: checkpoint complete: wrote 637 buffers (3.9%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=13.502 s, sync=0.105 s, total=13.733 s; sync files=2, longest=0.089 s, average=0.052 s
> B 2013-05-09 17:30:10.953 JST 32248 LOG: checkpoint starting: immediate force wait
> B 2013-05-09 17:30:10.963 JST 32248 LOG: This checkpoint record is on TimelineID = 1, loc is about 0/53F8CD0
> B 2013-05-09 17:30:11.003 JST 32248 LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.049 s; sync files=0, longest=0.000 s, average=0.000 s
> B 2013-05-09 17:30:11.096 JST 32248 LOG: checkpoint starting: immediate force wait
> B 2013-05-09 17:30:11.909 JST 32248 LOG: This checkpoint record is on TimelineID = 1, loc is about 0/540BEF8
> C 2013-05-09 17:30:11.929 JST 32256 LOG: invalid magic number 0000 in log segment 000000010000000000000005, offset 4169728
> C 2013-05-09 17:30:11.929 JST 32274 FATAL: terminating walreceiver process due to administrator command
> B 2013-05-09 17:30:11.951 JST 32248 LOG: checkpoint complete: wrote 18 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.017 s, sync=0.785 s, total=0.855 s; sync files=13, longest=0.235 s, average=0.060 s
> CHECKPOINT
> C 2013-05-09 17:30:13.931 JST 32256 LOG: Reading page on Timeline ID = 2
> C 2013-05-09 17:30:13.931 JST 32256 LOG: record with zero length at 0/53F8B90
> C 2013-05-09 17:30:13.931 JST 32256 LOG: Reading page on Timeline ID = 2
> C 2013-05-09 17:30:13.931 JST 32256 LOG: record with zero length at 0/53F8B90
> C 2013-05-09 17:30:18.936 JST 32256 LOG: Reading page on Timeline ID = 2
> C 2013-05-09 17:30:18.936 JST 32256 LOG: Reading page on Timeline ID = 2
> C 2013-05-09 17:30:18.936 JST 32256 LOG: record with zero length at 0/53F8B90
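
To relate the LSNs in the log above to the segment file and offset in the "invalid magic number" message, the arithmetic can be sketched as below. This is only a rough illustration: it assumes the default 16 MB WAL segment size, and the names parse_lsn and locate are made up for this example, not PostgreSQL functions.

```python
# Assumption: default 16 MB WAL segment size; a build with a different
# segment size would need this constant changed.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' (both hex) into one integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def locate(lsn_text, tli):
    """Return (segment file name, byte offset in that segment) for an LSN.

    Hypothetical helper: the name is built as three 8-digit hex fields
    (timeline, log id, segment within the log id).
    """
    lsn = parse_lsn(lsn_text)
    segno = lsn // WAL_SEGMENT_SIZE
    segs_per_id = 0x100000000 // WAL_SEGMENT_SIZE
    name = "%08X%08X%08X" % (tli, segno // segs_per_id, segno % segs_per_id)
    return name, lsn % WAL_SEGMENT_SIZE


if __name__ == "__main__":
    # Both LSNs are taken from the ERROR line logged by sby-B.
    for label, lsn in [("timeline 1 switched off at", "0/53F8B60"),
                       ("walsender streamed up to", "0/53FA000")]:
        name, offset = locate(lsn, 1)
        print("%s %s -> %s, offset %d" % (label, lsn, name, offset))
```

Under that assumption, 0/53FA000 maps to offset 4169728 in segment 000000010000000000000005, which is the same segment and offset that sby-C reports in its "invalid magic number" line.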

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
