PreallocXlogFiles

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: PreallocXlogFiles
Date: 2004-07-21 20:00:48
Message-ID: 1090438253.2658.1191.camel@localhost.localdomain

I notice this:

When a checkpoint occurs, if a log file is more than 75% full then a new
file will be allocated (in PreallocXlogFiles).

This assumes we checkpoint at least 4 times per log file, otherwise it
will be effectively random whether we actually ever do this or not. With
an uneven or bursty workload, we would need to checkpoint many more
times per xlog to even notice this is ever being called. (I never have).

...but we don't check that anywhere in the code.

Since checkpoints now default to every 300 seconds, we are assuming that
a log file takes at least 20 minutes to fill with an even workload,
which is not the case on busy systems. On slow systems, who cares
whether we preallocate or not? Especially now that we have the bgwriter
to smooth the workload of backends.
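
To put rough numbers on the argument above, here is a small Python model.
It is a sketch, not PostgreSQL code: the function name, the constant WAL
rate and the fixed phase offset are all assumptions made for illustration.
It counts how many segments get preallocated ahead of use when the only
trigger is "current file more than 75% full at checkpoint time".

```python
def prealloc_coverage(fill_seconds, checkpoint_seconds=300.0,
                      threshold=0.75, n_checkpoints=10_000, phase=37.0):
    """Fraction of xlog segments preallocated before they were needed.

    Toy model: WAL is written at a constant rate, so one segment fills every
    fill_seconds. A checkpoint fires every checkpoint_seconds (shifted by
    phase so it does not land exactly on a boundary) and preallocates the
    next segment only if the current one is more than threshold full.
    """
    preallocated = set()
    for k in range(1, n_checkpoints + 1):
        t = k * checkpoint_seconds + phase
        position = t / fill_seconds          # WAL position, in segments
        current = int(position)              # index of the segment in use
        if position - current > threshold:   # the "more than 75% full" test
            preallocated.add(current + 1)
    total_time = n_checkpoints * checkpoint_seconds + phase
    segments_used = int(total_time / fill_seconds)
    return len(preallocated) / segments_used


# Four checkpoints per segment at the default interval means 4 * 300 s.
min_fill_seconds = 4 * 300
print(min_fill_seconds / 60, "minutes")

for fill in (1200, 600, 70):
    print(fill, "s per segment ->", round(prealloc_coverage(fill), 3))
```

Under these assumptions a 1200 s fill time gets every segment
preallocated, a 600 s fill time gets either all or none of them depending
on where the checkpoints happen to fall (compare phase=37.0 with
phase=200.0), and a 70 s fill time gets well under a tenth of them.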

The idea was to preallocate a file before it is required. Mostly, though,
we just reach the end of the current file without having preallocated any
log files, so the preallocation is just a waste of time.

PreallocXlogFiles is only ever called during a normal Checkpoint or
after Recovery. In both cases, there will always be xlogs recycled and
so preallocation has already taken place (except in the trivial case of
the first few xlogs after an initdb).

I would like to remove PreallocXlogFiles on the basis that it is dead,
or at least pointless code.

Objections?

Best Regards, Simon Riggs


From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-21 23:35:43
Message-ID: Pine.LNX.4.58.0407220929210.6882@linuxworld.com.au

On Wed, 21 Jul 2004, Simon Riggs wrote:

> I notice this:
>
> When a checkpoint occurs, if a log file is more than 75% full then a new
> file will be allocated (in PreallocXlogFiles).
>
> This assumes we checkpoint at least 4 times per log file, otherwise it
> will be effectively random whether we actually ever do this or not. With
> an uneven or bursty workload, we would need to checkpoint many more
> times per xlog to even notice this is ever being called. (I never have).
>
> ...but we don't check that anywhere in the code.

I prefer the idea of just checking it more often to pulling the code out
altogether. I think this sits well with Jan's work on consistent
availability (buffer manager, vacuum delay).

The question is where to call it from. It's possible that the buffer
manager has enough information to guess how often a new xlog file
should be preallocated. The alternative would be to have (yet
another) backend look after this.

Or maybe the autovacuum backend could look after this. It would have
access to stats which may be useful, but it would mean that people would
have to run autovacuum if they wanted xlog files preallocated.

Thanks,

Gavin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Subject: Re: PreallocXlogFiles
Date: 2004-07-21 23:53:33
Message-ID: 23331.1090454013@sss.pgh.pa.us

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> I would like to remove PreallocXlogFiles on the basis that it is dead,
> or at least pointless code.

It could stand improvement I'm sure, but it's not pointless,
particularly not when you have archive mode turned on and so dead xlog
segments can't necessarily be recycled immediately. There's no
guarantee that there are very many segments available to be recycled
when a checkpoint happens, and so if you don't do some preallocation
you may find foreground processes forced to do the work instead when
they run out of forward xlog space.

If you assume a reasonably steady flow of xlog traffic and no
significant archiving delays, then you can see that the system settles
into a steady state where at each checkpoint about the same number of
old WAL files get rotated around to become forward xlog space, and
indeed there's little need for PreallocXlogFiles because MoveOfflineLogs
does all the heavy lifting.

However, I'm not at all convinced that this analysis holds up with
bursty traffic or when the archiver is delaying rotation of old xlogs.
If the number of physical WAL files needs to grow and shrink because
of such effects, then PreallocXlogFiles is the only thing that can
prevent foreground processes from having to do the work that should
be handled by the checkpointer.
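
The steady-state versus bursty argument can be illustrated with a toy
model in Python. This is only a sketch under stated assumptions: the
function, its parameters (stalled, max_recycled, prealloc) and the
per-interval accounting are hypothetical and do not mirror the actual
MoveOfflineLogs or PreallocXlogFiles logic.

```python
def foreground_creates(demand, stalled=(), max_recycled=6, prealloc=0):
    """Count xlog segments that foreground processes had to create.

    demand[i]     segments consumed during checkpoint interval i
    stalled       interval indexes at which the archiver releases nothing
    max_recycled  cap on ready segments kept after recycling
    prealloc      checkpointer tops the ready pool up to this many segments
    """
    ready = prealloc      # future segments already on disk
    pending = 0           # filled segments the archiver has not released
    created = 0
    for i, used in enumerate(demand):
        taken = min(ready, used)
        ready -= taken
        created += used - taken       # pool ran dry: backends create files
        pending += used
        if i not in stalled:          # checkpoint: recycle released segments
            ready = min(ready + pending, max_recycled)
            pending = 0
        ready = max(ready, prealloc)  # checkpoint: preallocate up to target
    return created


steady = [3] * 10
bursty = [3, 3, 3, 12, 3, 3, 3, 12, 3, 3]

print("steady:", foreground_creates(steady))
print("bursty:", foreground_creates(bursty))
print("bursty, prealloc=12:", foreground_creates(bursty, prealloc=12))
print("archiver stall:", foreground_creates(steady, stalled={3, 4}))
print("archiver stall, prealloc=6:",
      foreground_creates(steady, stalled={3, 4}, prealloc=6))
```

In this model the steady workload needs foreground creation only during
the first interval, while bursts and archiver stalls push work onto
foreground processes unless the pool is topped up ahead of time.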

I wonder whether we should not put back the preallocated-files GUC
parameter that Bruce took out a release or two back. PreallocXlogFiles
made a lot more sense back when that parameter existed.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-22 00:41:01
Message-ID: 1090456861.2658.1440.camel@localhost.localdomain

On Thu, 2004-07-22 at 00:35, Gavin Sherry wrote:
> On Wed, 21 Jul 2004, Simon Riggs wrote:
>
> > I notice this:
> >
> > When a checkpoint occurs, if a log file is more than 75% full then a new
> > file will be allocated (in PreallocXlogFiles).
> >
> > This assumes we checkpoint at least 4 times per log file, otherwise it
> > will be effectively random whether we actually ever do this or not. With
> > an uneven or bursty workload, we would need to checkpoint many more
> > times per xlog to even notice this is ever being called. (I never have).
> >
> > ...but we don't check that anywhere in the code.
>
> I prefer the idea of just checking it more often to pulling the code out
> altogether. I think this sits well with Jan's work on consistent
> availability (buffer manager, vacuum delay).
>

Good idea. Hey, we could get the archiver to do this, seeing as it knows
when the logs are full. Just do: I've seen a full one, so I'll prealloc
another. No test, just alloc. (Or the bgwriter could do it...)

On Thu, 2004-07-22 at 00:53, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I would like to remove PreallocXlogFiles on the basis that it is dead,
> > or at least pointless code.
>
> I wonder whether we should not put back the preallocated-files GUC
> parameter that Bruce took out a release or two back. PreallocXlogFiles
> made a lot more sense back when that parameter existed.

That's simplest, especially if the code is there.

But again, if you set it to a constant value it's not really responding
to system demands; it's just the admin's guess of what to set it to.

Gavin's idea sounds more optimal...

> However, I'm not at all convinced that this analysis holds up with
> bursty traffic or when the archiver is delaying rotation of old xlogs.
> If the number of physical WAL files needs to grow and shrink because
> of such effects, then PreallocXlogFiles is the only thing that can
> prevent foreground processes from having to do the work that should
> be handled by the checkpointer.

Yes, I agree, but the checkpointer isn't waking up often enough
currently to do this effectively. It's just randomly doing it.

Best regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-22 00:44:40
Message-ID: 23786.1090457080@sss.pgh.pa.us

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> Yes, I agree, but the checkpointer isn't waking up often enough
> currently to do this effectively. It's just randomly doing it.

Agreed. Maybe it should be part of the bgwriter's idle loop, and
not directly associated with checkpoints at all.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-22 07:46:59
Message-ID: 1090482419.2660.9.camel@localhost.localdomain

On Thu, 2004-07-22 at 01:44, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > Yes, I agree, but the checkpointer isn't waking up often enough
> > currently to do this effectively. It's just randomly doing it.
>
> Agreed. Maybe it should be part of the bgwriter's idle loop, and
> not directly associated with checkpoints at all.
>

Yes, that's a more natural home, now that the bgwriter exists. But does
it know when log files are full? How would it know?

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-22 14:19:10
Message-ID: 9451.1090505950@sss.pgh.pa.us

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Thu, 2004-07-22 at 01:44, Tom Lane wrote:
>> Agreed. Maybe it should be part of the bgwriter's idle loop, and
>> not directly associated with checkpoints at all.

> Yes, that's a more natural home, now that the bgwriter exists. But does
> it know when log files are full? How would it know?

It can run PreallocXlogFiles --- or more likely a modified version of
same. There isn't anything that function needs to do that the bgwriter
can't do (in fact, the bgwriter is what runs checkpoints now...)

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-24 09:58:54
Message-ID: 1090663133.3057.97.camel@localhost.localdomain

On Thu, 2004-07-22 at 15:19, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Thu, 2004-07-22 at 01:44, Tom Lane wrote:
> >> Agreed. Maybe it should be part of the bgwriter's idle loop, and
> >> not directly associated with checkpoints at all.
>
> > Yes, that's a more natural home, now that the bgwriter exists. But does
> > it know when log files are full? How would it know?
>
> It can run PreallocXlogFiles --- or more likely a modified version of
> same. There isn't anything that function needs to do that the bgwriter
> can't do (in fact, the bgwriter is what runs checkpoints now...)
>

I can see roughly how to do this, but it is a can of worms I don't want
to open when I don't have much time. Some thoughts and ideas for later:

The Checkpoint code writes to xlog, so finds out what the recptr is for
free, then tries to act on that knowledge in PreallocXlogFiles.

Calling PreallocXlogFiles outside of the Checkpoint code is
straightforward to initiate from bgwriter.c, but the caller must have
already obtained the current recptr position. That would require
attempting to gain a lock on XLogCtl, then releasing it quickly after
having read the pointer. Then call Prealloc...

...Unless there is a heuristic to use, rather than exact knowledge of
the recptr... perhaps predicting something from the last 3 checkpoint
durations?
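
One possible reading of such a heuristic, sketched in Python. Everything
here is an assumption for illustration: the class name, the idea of
recording how many segments each checkpoint interval consumed, and the
simple average over a window of three are not taken from any PostgreSQL
code or from a concrete proposal in this thread.

```python
import math
from collections import deque


class WalDemandEstimator:
    """Guess upcoming xlog demand from the last few checkpoint intervals."""

    def __init__(self, window=3):
        # Each sample is (interval_seconds, segments_used); old ones drop off.
        self.samples = deque(maxlen=window)

    def record_checkpoint(self, seconds, segments_used):
        self.samples.append((seconds, segments_used))

    def segments_needed(self, horizon_seconds):
        """Segments expected to be consumed over the next horizon_seconds."""
        total_seconds = sum(s for s, _ in self.samples)
        total_segments = sum(n for _, n in self.samples)
        if total_seconds == 0:
            return 0                      # no history yet, no prediction
        return math.ceil(total_segments * horizon_seconds / total_seconds)


est = WalDemandEstimator()
for seconds, used in [(300, 2), (300, 3), (300, 4)]:
    est.record_checkpoint(seconds, used)
print(est.segments_needed(300))
```

A caller could compare the estimate with the number of segments already
available and preallocate the difference, without reading the recptr.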

I'll return to this later.

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-24 14:22:55
Message-ID: 4081.1090678975@sss.pgh.pa.us

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> Calling PreallocXlogFiles outside of the Checkpoint code is
> straightforward to initiate from bgwriter.c, but the caller must have
> already obtained the current recptr position. That would require
> attempting to gain a lock on XLogCtl, then releasing it quickly after
> having read the pointer. Then call Prealloc...

When I said "modified version", I meant that we'd change the function
to make it self-contained. Passing an already-obtained recptr is
convenient when it's being invoked at the end of Checkpoint, but to
be called from the bgwriter loop it should just get the necessary lock
and fetch the pointer for itself.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PreallocXlogFiles
Date: 2004-07-24 20:31:05
Message-ID: 1090701065.3057.113.camel@localhost.localdomain

On Sat, 2004-07-24 at 15:22, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > Calling PreallocXlogFiles outside of the Checkpoint code is
> > straightforward to initiate from bgwriter.c, but the caller must have
> > already obtained the current recptr position. That would require
> > attempting to gain a lock on XLogCtl, then releasing it quickly after
> > having read the pointer. Then call Prealloc...
>
> When I said "modified version", I meant that we'd change the function
> to make it self-contained. Passing an already-obtained recptr is
> convenient when it's being invoked at the end of Checkpoint, but to
> be called from the bgwriter loop it should just get the necessary lock
> and fetch the pointer for itself.
>

Whichever... I envisaged a new wrapper function in xlog.c, called from
bgwriter.c, rather than changing PreallocXlogFiles.

Your way sounds better. Leave the parameters the same and just add a
section of code: if recptr==NULL then {get recptr}.
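
The shape of that change, modelled in Python rather than in xlog.c. This
is a sketch under assumptions: XLogCtlModel, the flat integer WAL
position and the set of existing segment numbers are hypothetical
stand-ins, and only the 16 MB segment size reflects PostgreSQL's default.

```python
import threading
from typing import Optional

XLOG_SEG_SIZE = 16 * 1024 * 1024   # 16 MB segments, PostgreSQL's default


class XLogCtlModel:
    """Stand-in for shared XLogCtl state: a lock plus the insert position."""

    def __init__(self, insert_pos: int = 0):
        self.lock = threading.Lock()
        self.insert_pos = insert_pos    # byte offset into the WAL stream


def prealloc_xlog_files(ctl: XLogCtlModel, existing: set,
                        recptr: Optional[int] = None) -> Optional[int]:
    """Preallocate the next segment if the current one is over 75% full.

    The checkpoint path passes the recptr it already has; a caller such as
    the bgwriter passes None and the function fetches the pointer itself.
    Returns the new segment number, or None if nothing was created.
    """
    if recptr is None:
        with ctl.lock:                  # hold the lock only to copy the value
            recptr = ctl.insert_pos
    segno, offset = divmod(recptr, XLOG_SEG_SIZE)
    if offset * 4 <= XLOG_SEG_SIZE * 3:
        return None                     # not more than 75% full yet
    next_seg = segno + 1
    if next_seg in existing:
        return None                     # already recycled or preallocated
    existing.add(next_seg)              # stands in for creating the file
    return next_seg


MB = 1024 * 1024
ctl = XLogCtlModel(insert_pos=13 * MB)   # 13 of 16 MB used in segment 0
files = set()
print(prealloc_xlog_files(ctl, files))                # bgwriter-style call
print(prealloc_xlog_files(ctl, files, recptr=4 * MB)) # checkpoint-style call
```

Keeping the parameter optional means the existing checkpoint call site
stays unchanged while a new caller needs no knowledge of the lock.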

Main point: I'm nearly out of time if I'm to finish the other things on
my must-complete list: docs and backup start/end function design.

Best Regards, Simon Riggs