WAL segments (names) not in a sequence

Lists: pgsql-hackers
From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, German Becker <german(dot)becker(at)gmail(dot)com>
Subject: WAL segments (names) not in a sequence
Date: 2013-05-23 01:10:46
Message-ID: CA+HiwqHi7Gz44ZM=RW7ih-4WTrrfSEMXsWN55zQ9bsmHfu73xA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

A PostgreSQL user recently reported on pgsql-admin about an issue:
when he changed wal_level from 'minimal' to 'hot_standby', the WAL
segment sequence rewound, that is, it started using old names. A
snippet of his "ls -lrt pg_xlog":

-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000B3
-rw------- 1 postgres postgres 16777216 May 21 12:13 00000001000000100000000A
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000036
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000008
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000F6
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000E4
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000F2
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000003
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000D5
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000A9
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000D6
-rw------- 1 postgres postgres 16777216 May 21 12:13 00000001000000100000004E
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000D8
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000B4
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000BA
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000C3
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000071
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000047
-rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000026

# WAL after wal_level changed from 'minimal' to 'hot_standby'

-rw------- 1 postgres postgres 16777216 May 21 12:27 000000010000000E0000007B
-rw------- 1 postgres postgres 16777216 May 21 12:35 000000010000000E0000007C
-rw------- 1 postgres postgres 276 May 21 12:35
000000010000000E0000007C.00000020.backup
-rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007D
-rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007E
-rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007F
-rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E00000080

Is this a known issue?

--
Amit Langote


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, German Becker <german(dot)becker(at)gmail(dot)com>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 07:10:52
Message-ID: CAHGQGwHrzQfq_9PH7M18RxxG6EVfUTDmcStiZhdA+VL0EYDYdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 23, 2013 at 10:10 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> A PostgreSQL user recently reported on pgsql-admin about an issue:
> when he changed wal_level from 'minimal' to 'hot_standby', the WAL
> segment sequence rewound, that is, it started using old names. A
> snippet of his "ls -lrt pg_xlog":
>
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000B3
> -rw------- 1 postgres postgres 16777216 May 21 12:13 00000001000000100000000A
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000036
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000008
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000F6
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000E4
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000F2
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000003
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000D5
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000A9
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000D6
> -rw------- 1 postgres postgres 16777216 May 21 12:13 00000001000000100000004E
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000D8
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000B4
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000BA
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000000F000000C3
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000071
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000047
> -rw------- 1 postgres postgres 16777216 May 21 12:13 000000010000001000000026

I think these are the WAL files that were preallocated by WAL
recycling but have not
been used yet.

> # WAL after wal_level changed from 'minimal' to 'hot_standby'
>
> -rw------- 1 postgres postgres 16777216 May 21 12:27 000000010000000E0000007B
> -rw------- 1 postgres postgres 16777216 May 21 12:35 000000010000000E0000007C
> -rw------- 1 postgres postgres 276 May 21 12:35
> 000000010000000E0000007C.00000020.backup
> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007D
> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007E
> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007F
> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E00000080

These are the WAL files that you now used. So I don't think that WAL
file sequence rewound
in this case.

Regards,

--
Fujii Masao


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, German Becker <german(dot)becker(at)gmail(dot)com>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 08:01:17
Message-ID: CA+HiwqEaW2Fn-n5GY9x5HTLfJfRAeDjmOaQ3vbcUWaxCPGBZ7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I think these are the WAL files that were preallocated by WAL
> recycling but have not
> been used yet.
>
>> # WAL after wal_level changed from 'minimal' to 'hot_standby'
>>
>> -rw------- 1 postgres postgres 16777216 May 21 12:27 000000010000000E0000007B
>> -rw------- 1 postgres postgres 16777216 May 21 12:35 000000010000000E0000007C
>> -rw------- 1 postgres postgres 276 May 21 12:35
>> 000000010000000E0000007C.00000020.backup
>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007D
>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007E
>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007F
>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E00000080
>
> These are the WAL files that you now used. So I don't think that WAL
> file sequence rewound
> in this case.
>

Can pre-allocation go that further? for example, assuming
000000010000000E00000080 is currently being used, then is it possible
that a segment named/numbered 00000001000000100000007E (which does
exist in his pg_xlog as he reported in pgsql-admin thread) is
pre-allocated already?

I think we could ask the user the latest value of "select
pg_xlogfile_name(pg_xlog_current_location())".

--
Amit Langote


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, German Becker <german(dot)becker(at)gmail(dot)com>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 08:15:24
Message-ID: CAHGQGwExqB54uJcZELykrspq24YZr42LSETiZyFCFcJadA31Pw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 23, 2013 at 5:01 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> I think these are the WAL files that were preallocated by WAL
>> recycling but have not
>> been used yet.
>>
>>> # WAL after wal_level changed from 'minimal' to 'hot_standby'
>>>
>>> -rw------- 1 postgres postgres 16777216 May 21 12:27 000000010000000E0000007B
>>> -rw------- 1 postgres postgres 16777216 May 21 12:35 000000010000000E0000007C
>>> -rw------- 1 postgres postgres 276 May 21 12:35
>>> 000000010000000E0000007C.00000020.backup
>>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007D
>>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007E
>>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E0000007F
>>> -rw------- 1 postgres postgres 16777216 May 21 14:53 000000010000000E00000080
>>
>> These are the WAL files that you now used. So I don't think that WAL
>> file sequence rewound
>> in this case.
>>
>
> Can pre-allocation go that further? for example, assuming
> 000000010000000E00000080 is currently being used, then is it possible
> that a segment named/numbered 00000001000000100000007E (which does
> exist in his pg_xlog as he reported in pgsql-admin thread) is
> pre-allocated already?

Yes, if it's so old that it's no longer required for the crash recovery.

WAL recycling is performed by checkpoint. Checkpoint always checks
whether there are
WAL files no longer required for crash recovery, IOW, WAL files which
were generated
before the prior checkpoint happened, and then if they are found,
checkpoint tries to recycle
them.

> I think we could ask the user the latest value of "select
> pg_xlogfile_name(pg_xlog_current_location())".

Yep.

Regards,

--
Fujii Masao


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, German Becker <german(dot)becker(at)gmail(dot)com>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 08:25:58
Message-ID: CA+HiwqFVwX2Umq22fU2mywHkkzTQOqmtXQDDB-_v_iZS=cShMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> Can pre-allocation go that further? for example, assuming
>> 000000010000000E00000080 is currently being used, then is it possible
>> that a segment named/numbered 00000001000000100000007E (which does
>> exist in his pg_xlog as he reported in pgsql-admin thread) is
>> pre-allocated already?
>
> Yes, if it's so old that it's no longer required for the crash recovery.
>
> WAL recycling is performed by checkpoint. Checkpoint always checks
> whether there are
> WAL files no longer required for crash recovery, IOW, WAL files which
> were generated
> before the prior checkpoint happened, and then if they are found,
> checkpoint tries to recycle
> them.
>

Okay, now I understand. Also, looking at his "ls -l pg_xlog", I could
find that modified timestamps of all those pre-allocated segments are
about similar (around 12:10), whereas the latest modified time (15:37)
is of segment 000000010000000E000000A7.

Wonder if whatever configuration he is using is sub-optimal that these
many WAL segments can be re-cycled upon a checkpoint? Or is this okay?

--
Amit Langote


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, German Becker <german(dot)becker(at)gmail(dot)com>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 08:29:17
Message-ID: CAL_0b1sir2G2MnzqAWDeNLR0FbU+SthHTvF_XiAqXTXXj9QFtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 23, 2013 at 1:25 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Okay, now I understand. Also, looking at his "ls -l pg_xlog", I could
> find that modified timestamps of all those pre-allocated segments are
> about similar (around 12:10), whereas the latest modified time (15:37)
> is of segment 000000010000000E000000A7.
>
> Wonder if whatever configuration he is using is sub-optimal that these
> many WAL segments can be re-cycled upon a checkpoint? Or is this okay?

Is archive_mode=on?
What is archive_command?
Is the server in the recovery mode?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray(dot)ru(at)gmail(dot)com


From: German Becker <german(dot)becker(at)gmail(dot)com>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 13:18:53
Message-ID: CALyjCLvw-2z7SPApYj5ueXv_VSKJvMjvv=6AR+g6uLZQ8SNMGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 23, 2013 at 5:29 AM, Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> wrote:

> On Thu, May 23, 2013 at 1:25 AM, Amit Langote <amitlangote09(at)gmail(dot)com>
> wrote:
> > Okay, now I understand. Also, looking at his "ls -l pg_xlog", I could
> > find that modified timestamps of all those pre-allocated segments are
> > about similar (around 12:10), whereas the latest modified time (15:37)
> > is of segment 000000010000000E000000A7.
> >
> > Wonder if whatever configuration he is using is sub-optimal that these
> > many WAL segments can be re-cycled upon a checkpoint? Or is this okay?
>
> Is archive_mode=on?
> What is archive_command?
> Is the server in the recovery mode?
>
> --
> Kind regards,
> Sergey Konoplev
> PostgreSQL Consultant and DBA
>
> Profile: http://www.linkedin.com/in/grayhemp
> Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
> Skype: gray-hemp
> Jabber: gray(dot)ru(at)gmail(dot)com
>

Hi Sergey and all,
Let me describe the process I follow to get to this. What I am doing is
testing a migration from 8.3 to 9.1. They way I plan to do it is the
following.
1) Create the schema
2) import the biggest tables, which are not updated,only growing, with COPY
(this is about 35gb of data)
2)import the small, changing part of the data

The target system is 9.1 with streaming relication.
For steps 1 and 2, I set a "restore" configuration, that amongs other
things like more work mem, it sets archive_mode=off and wal_level=minimal
(attached the difference between restore and normal).
The archive_command is just a cp wrapped in a shell script in case I need
to change it.

Let me know if you need any more info

Attachment Content-Type Size
restorevsnormal.diff application/octet-stream 3.3 KB

From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: German Becker <german(dot)becker(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-23 22:44:12
Message-ID: CAL_0b1uNCiyYaWZt1CaiDoZJSHHGa48zN2kp3bx-fApY9tje=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 23, 2013 at 6:18 AM, German Becker <german(dot)becker(at)gmail(dot)com> wrote:
> Let me describe the process I follow to get to this. What I am doing is
> testing a migration from 8.3 to 9.1. They way I plan to do it is the
> following.
> 1) Create the schema
> 2) import the biggest tables, which are not updated,only growing, with COPY
> (this is about 35gb of data)
> 2)import the small, changing part of the data
>
>
> The target system is 9.1 with streaming relication.
> For steps 1 and 2, I set a "restore" configuration, that amongs other things
> like more work mem, it sets archive_mode=off and wal_level=minimal (attached
> the difference between restore and normal).
> The archive_command is just a cp wrapped in a shell script in case I need to
> change it.

You can not migrate between any major versions with WAL based or
streaming replication.

Use either full dump/restore or schema only dump/restore plus trigger
based replication (londiste, slony) to migrate data.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray(dot)ru(at)gmail(dot)com


From: German Becker <german(dot)becker(at)gmail(dot)com>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 12:16:02
Message-ID: CALyjCLswNyS4qjBxVNugQaGQvVxcAGOZrYdaNgscJ2HJJS15Qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Sergey,

Maybe I didn't explain correctly. I am using COPY/pg_dump/pg_restore for
migration (and it is working fine). The streaming replication is for
hot-standby replication *once migrated*. Thing is I disbable archving and
set wal_level to minimal, when migrating the large portion of data, to make
it faster. Then I switch to wal_level=hot_standby, i.e the "production"
configuration, and the WAL segment seuqence seems to overlap with the
segments generated with the other setting.

On Thu, May 23, 2013 at 7:44 PM, Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> wrote:

> On Thu, May 23, 2013 at 6:18 AM, German Becker <german(dot)becker(at)gmail(dot)com>
> wrote:
> > Let me describe the process I follow to get to this. What I am doing is
> > testing a migration from 8.3 to 9.1. They way I plan to do it is the
> > following.
> > 1) Create the schema
> > 2) import the biggest tables, which are not updated,only growing, with
> COPY
> > (this is about 35gb of data)
> > 2)import the small, changing part of the data
> >
> >
> > The target system is 9.1 with streaming relication.
> > For steps 1 and 2, I set a "restore" configuration, that amongs other
> things
> > like more work mem, it sets archive_mode=off and wal_level=minimal
> (attached
> > the difference between restore and normal).
> > The archive_command is just a cp wrapped in a shell script in case I
> need to
> > change it.
>
> You can not migrate between any major versions with WAL based or
> streaming replication.
>
> Use either full dump/restore or schema only dump/restore plus trigger
> based replication (londiste, slony) to migrate data.
>
> --
> Kind regards,
> Sergey Konoplev
> PostgreSQL Consultant and DBA
>
> Profile: http://www.linkedin.com/in/grayhemp
> Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
> Skype: gray-hemp
> Jabber: gray(dot)ru(at)gmail(dot)com
>


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: German Becker <german(dot)becker(at)gmail(dot)com>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 13:01:31
Message-ID: CA+HiwqFFeckomHyN8+q=w+TfnkogBEPLk7DY-ek_m39U=MDvFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Maybe I didn't explain correctly. I am using COPY/pg_dump/pg_restore for
> migration (and it is working fine). The streaming replication is for
> hot-standby replication *once migrated*. Thing is I disbable archving and
> set wal_level to minimal, when migrating the large portion of data, to make
> it faster. Then I switch to wal_level=hot_standby, i.e the "production"
> configuration, and the WAL segment seuqence seems to overlap with the
> segments generated with the other setting.
>

Though, now you understand it's not what it looks like, right? :-)

--
Amit Langote


From: German Becker <german(dot)becker(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 15:28:19
Message-ID: CALyjCLugLBkgGm2D=cZJSs+wo4_cPOYjDcrRiChBSYmPv3+K0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, May 24, 2013 at 10:01 AM, Amit Langote <amitlangote09(at)gmail(dot)com>wrote:

> > Maybe I didn't explain correctly. I am using COPY/pg_dump/pg_restore for
> > migration (and it is working fine). The streaming replication is for
> > hot-standby replication *once migrated*. Thing is I disbable archving and
> > set wal_level to minimal, when migrating the large portion of data, to
> make
> > it faster. Then I switch to wal_level=hot_standby, i.e the "production"
> > configuration, and the WAL segment seuqence seems to overlap with the
> > segments generated with the other setting.
> >
>
> Though, now you understand it's not what it looks like, right? :-)
>
>
> --
> Amit Langote
>

I didn't quite understand what you mean by that... But anyways so do you
people think this sequence number overlap is "normal" ?


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: German Becker <german(dot)becker(at)gmail(dot)com>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 15:46:39
Message-ID: CA+HiwqGVmO-DJGwj0zJRnnNhpxYJsmDgoLB59QGXckhq8Y0+oQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I didn't quite understand what you mean by that... But anyways so do you
> people think this sequence number overlap is "normal" ?

There is "no overlap" at all. The newer segments that you see are
"pre-allocated" ones. They have not been written to yet.

From the "ls -l pg_xlog" output that you sent, it can be seen that
segments starting from 000000010000000E000000A8 through
00000001000000100000007E have been pre-allocated (at that point of
time) and 000000010000000E000000A7 is currently being written to. Just
look at the modified times in your "ls -l" listing.
000000010000000E000000A7 has May 22 15:32 (the latest writes seem to
have happened to this segment) whereas pre-allocated ones seem to have
around May 22 12:05 to 12:15 (which are yet to be written to).

Does that help?

--
Amit Langote


From: German Becker <german(dot)becker(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 18:08:41
Message-ID: CALyjCLsAvEHrOTNtxTj6t_tRZFUbChHiwdVctH5mRja7LYny-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks Amit, I understand now. Is there a way to know/predict how many
prealocated segments will there be in a certain moment? What does it
deppend on?

On Fri, May 24, 2013 at 12:46 PM, Amit Langote <amitlangote09(at)gmail(dot)com>wrote:

> > I didn't quite understand what you mean by that... But anyways so do you
> > people think this sequence number overlap is "normal" ?
>
> There is "no overlap" at all. The newer segments that you see are
> "pre-allocated" ones. They have not been written to yet.
>
> From the "ls -l pg_xlog" output that you sent, it can be seen that
> segments starting from 000000010000000E000000A8 through
> 00000001000000100000007E have been pre-allocated (at that point of
> time) and 000000010000000E000000A7 is currently being written to. Just
> look at the modified times in your "ls -l" listing.
> 000000010000000E000000A7 has May 22 15:32 (the latest writes seem to
> have happened to this segment) whereas pre-allocated ones seem to have
> around May 22 12:05 to 12:15 (which are yet to be written to).
>
> Does that help?
>
> --
> Amit Langote
>


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: German Becker <german(dot)becker(at)gmail(dot)com>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 18:24:17
Message-ID: CA+HiwqHc+FmKFyhf3CNh4LMmCyj6zyn4SiVs7i=Pd+scsd3rQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, May 25, 2013 at 3:08 AM, German Becker <german(dot)becker(at)gmail(dot)com> wrote:
> Thanks Amit, I understand now. Is there a way to know/predict how many
> prealocated segments will there be in a certain moment? What does it deppend
> on?

Upthread, Fujii Masao-san suggested what might have happened that
caused these pre-allocated segments to be created. To quote him:

"WAL recycling is performed by checkpoint. Checkpoint always checks
whether there are WAL files no longer required for crash recovery,
IOW, WAL files which were generated before the prior checkpoint
happened, and then if they are found, checkpoint tries to recycle
them."

Reading here would also help:
http://www.postgresql.org/docs/9.1/static/wal-configuration.html

If you are still using the same values as during this observation,
could you provide values for these postgresql.conf parameters:
checkpoint_segments, checkpoint_timeout, wal_keep_segments?

--
Amit Langote


From: German Becker <german(dot)becker(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL segments (names) not in a sequence
Date: 2013-05-24 18:50:30
Message-ID: CALyjCLvem97TbuTq0ZihYurfwetW5qz5fOVa80kSnXdxPO6a9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, May 24, 2013 at 3:24 PM, Amit Langote <amitlangote09(at)gmail(dot)com>wrote:

> On Sat, May 25, 2013 at 3:08 AM, German Becker <german(dot)becker(at)gmail(dot)com>
> wrote:
> > Thanks Amit, I understand now. Is there a way to know/predict how many
> > prealocated segments will there be in a certain moment? What does it
> deppend
> > on?
>
> Upthread, Fujii Masao-san suggested what might have happened that
> caused these pre-allocated segments to be created. To quote him:
>
> "WAL recycling is performed by checkpoint. Checkpoint always checks
> whether there are WAL files no longer required for crash recovery,
> IOW, WAL files which were generated before the prior checkpoint
> happened, and then if they are found, checkpoint tries to recycle
> them."
>
> Reading here would also help:
> http://www.postgresql.org/docs/9.1/static/wal-configuration.html
>
> If you are still using the same values as during this observation,
> could you provide values for these postgresql.conf parameters:
> checkpoint_segments, checkpoint_timeout, wal_keep_segments?
>
>
> --
> Amit Langote
>

Amit,

Frist, thanks for your help and your interest. I (think) I understand how
checkpoint /wal segment work. What I didn't understand from the
documentation is the possibility of segments being prealocated. I thought
that the WAL segment with the higher sequence number is the one being
written at present time, as opposed to a segment allocated to be written in
the future. If you could clarify this point to me, that would be great.

Here are the parameters related to checkpoint

For "restore" config
checkpoint_segments = 256 # in logfile segments, min 1, 16MB
each
#checkpoint_timeout = 5min # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 -
1.0
#checkpoint_warning = 30s

#wal_keep_segments = 0 # in logfile segments, 16MB each; 0 disables

For "production" config:
checkpoint_segments = 256 # in logfile segments, min 1, 16MB
each
#checkpoint_timeout = 5min # range 30s-1h
checkpoint_completion_target = 0.7 # checkpoint target duration, 0.0 -
1.0
wal_keep_segments = 256 # in logfile segments, 16MB each; 0 disables

Thanks again,

Germán