Re: Performance features the 4th

Lists: pgsql-hackers
From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Performance features the 4th
Date: 2003-11-05 19:06:58
Message-ID: 3FA94A52.8070603@Yahoo.com

I've just uploaded

http://developer.postgresql.org/~wieck/all_performance.v4.74.diff.gz

This patch contains the "still not yet ready" performance improvements
discussed over the last couple of days.

_Shared buffer replacement_:

The buffer replacement strategy is a slightly modified version of ARC.
The modifications are some specializations about CDB promotions. Since
PostgreSQL always looks up a buffer multiple times when updating (first
during the scan, then during the heap_update() etc.), every updated
block would jump right into the T2 (frequently accessed) queue. To
prevent that, the Xid of the transaction that added a buffer to the T1
queue is remembered, and if a block is found in T1, that same
transaction will not promote it into T2. This also affects blocks
accessed like SELECT ... FOR UPDATE; UPDATE, as this is a usual strategy
and does not mean that this particular datum is accessed frequently.
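
A toy sketch of the promotion rule described above, in Python rather than
the backend's C. The class and method names are made up for illustration
and are not taken from the patch; the B1/B2 ghost lists, eviction and the
autotuning are left out, and whether a same-transaction hit refreshes the
block's position in T1 is not specified here, so the sketch leaves it alone.

```python
from collections import OrderedDict


class ArcQueues:
    """Toy model of the T1 (recent) and T2 (frequent) queues only."""

    def __init__(self):
        self.t1 = OrderedDict()  # block -> xid of the transaction that loaded it
        self.t2 = OrderedDict()  # block -> None

    def access(self, block, xid):
        """Record an access and return the queue the block is in afterwards."""
        if block in self.t2:
            self.t2.move_to_end(block)  # already frequent: move to MRU end
            return "t2"
        if block in self.t1:
            if self.t1[block] == xid:
                # Same transaction touching the block again (scan, then
                # heap_update()): not evidence of frequent use, stay in T1.
                return "t1"
            del self.t1[block]
            self.t2[block] = None  # a different transaction: promote
            return "t2"
        self.t1[block] = xid  # first access: enter T1 and remember the xid
        return "t1"
```

With this model, a block scanned and then updated by transaction 100 stays
in T1, and only a later access by another transaction moves it to T2.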

Blocks faulted in by vacuum are handled specially: they end up at the
LRU end of the T1 queue, and when they are evicted from there their CDB
gets destroyed instead of being added to the B1 queue. This prevents
vacuum from polluting the cache's autotuning.

A guc variable

buffer_strategy_status_interval = 0 # 0-600 seconds

controls DEBUG1 messages every n seconds showing the current queue sizes
and the cache hitrates during the last interval.

_Vacuum page delay_:

Tom Lane's napping during vacuums with another tuning option. I replaced
the usleep() call with a PG_DELAY(msec) macro in miscadmin.h, which uses
select(2) instead. That should address the possible portability
problems.
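
For readers unfamiliar with the trick: calling select() with no file
descriptors and only a timeout simply sleeps for that long. A minimal
Python illustration follows; pg_delay is a hypothetical stand-in for the
macro, not its actual definition. Python documents that passing three
empty lists works on Unix but not on Windows.

```python
import select
import time


def pg_delay(msec):
    """Sleep for msec milliseconds via select() with no descriptors."""
    select.select([], [], [], msec / 1000.0)


start = time.monotonic()
pg_delay(20)
elapsed_ms = (time.monotonic() - start) * 1000.0
print("napped for about %.1f ms" % elapsed_ms)
```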

The config options

vacuum_page_group_delay = 0 # 0-100 milliseconds
vacuum_page_group_size = 10 # 1-1000 pages

control how many pages get vacuumed as a group and how long vacuum will
nap between groups.
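
A rough model of how the two settings interact, written as a Python
sketch. The function name and the injectable nap/process callbacks are
assumptions for illustration; the real loop lives in the vacuum code and
is not shown in this mail.

```python
import time


def vacuum_pages(pages, group_size=10, group_delay_ms=0, nap=None, process=None):
    """Process pages and nap after every full group; return the nap count."""
    if nap is None:
        nap = lambda ms: time.sleep(ms / 1000.0)
    naps = 0
    for count, page in enumerate(pages, start=1):
        if process is not None:
            process(page)  # stand-in for the actual per-page vacuum work
        # A delay of 0 disables napping entirely, matching the default above.
        if group_delay_ms > 0 and count % group_size == 0:
            nap(group_delay_ms)
            naps += 1
    return naps
```

For example, 25 pages with a group size of 10 and a non-zero delay give
two naps, one after page 10 and one after page 20.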

I think this can be improved further if vacuum gets feedback from the
buffer manager on whether a page was found in the cache (clean or
already dirty) or had to be faulted in. Together with whether vacuum
actually dirties the page, this would give a sort of "vacuum page cost"
that is accumulated and controls how often to nap. Vacuuming a page
that is found in the cache and has no dead tuples is then cheap, while
vacuuming a page that caused another dirty block to get evicted, was
then read in and finally ends up dirty because of dead tuples is
expensive.
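
One way to picture the accumulated "vacuum page cost" is the sketch
below. It is only an illustration of the idea: the class, the three
flags and all the weights (hit, miss, evict, dirty, limit) are
hypothetical values chosen for the example, not anything in the patch.

```python
class VacuumCost:
    """Accumulate a per-page cost and report when vacuum should nap."""

    def __init__(self, limit=200, hit=1, miss=10, evict=20, dirty=20):
        self.limit = limit
        self.hit = hit      # page was already in the cache
        self.miss = miss    # page had to be read in
        self.evict = evict  # reading it forced a dirty block out
        self.dirty = dirty  # vacuum itself dirtied the page
        self.balance = 0

    def charge(self, found_in_cache, evicted_dirty=False, dirtied=False):
        """Add the cost of one page; return True if it is time to nap."""
        self.balance += self.hit if found_in_cache else self.miss
        if evicted_dirty:
            self.balance += self.evict
        if dirtied:
            self.balance += self.dirty
        if self.balance >= self.limit:
            self.balance = 0  # start accumulating again after the nap
            return True
        return False
```

With these example weights, 200 cached pages without dead tuples pass
between naps, while the expensive case (miss, dirty eviction, page
dirtied) costs 50 and triggers a nap every fourth page.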

_Lazy checkpoint_:

This is the checkpoint process with the ability to spread the buffer
flushing over some time. The buffers are also written in an order
determined by the buffer replacement strategy. Currently that is a
merged list of dirty buffers in the order of the T1 and T2 queues of
ARC. Since buffers are replaced in that order, it causes backends to
find clean buffers for eviction more often.

The config options

lazy_checkpoint_time = 0 # 0-3600 seconds
lazy_checkpoint_group_size = 50 # 10-1000 pages
lazy_checkpoint_maxdelay = 500 # 100-1000 milliseconds

control how long the buffer flushing "should" take and how many dirty
pages to write as a group before syncing and napping. The maxdelay
parameter caps the nap between groups, so that a really small amount of
changes is not spread out over the whole lazy_checkpoint_time.
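
As a back-of-the-envelope sketch of how these three settings could fit
together: the formula below assumes the nap between groups is the target
time divided evenly over the groups and then capped at maxdelay. That is
a guess at the pacing for illustration, not the patch's actual
arithmetic, and the function name is made up.

```python
import math


def checkpoint_nap_ms(dirty_pages, checkpoint_time_s=0, group_size=50, maxdelay_ms=500):
    """Nap between write groups, in ms, under the assumed even pacing."""
    if checkpoint_time_s <= 0 or dirty_pages <= 0:
        return 0  # lazy checkpoint disabled, or nothing to write
    groups = math.ceil(dirty_pages / group_size)
    even_spread_ms = checkpoint_time_s * 1000 / groups
    return min(maxdelay_ms, even_spread_ms)


# 100000 dirty pages over 300 s in groups of 50: 2000 groups, 150 ms each.
print(checkpoint_nap_ms(100000, checkpoint_time_s=300))
# Only 100 dirty pages: an even spread would nap 150 s, so the cap applies.
print(checkpoint_nap_ms(100, checkpoint_time_s=300))
```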

The syncing is currently done in a new function in md.c, mdfsyncrecent(),
called through the smgr. The intention is to maintain some LRU of
written-to file descriptors and pg_fdatasync() them. I haven't found the
right place for that yet, so it simply does a system-global sync().

My idea here is that it really does not matter how accurately the
individual files are forced to disk during this. All we care about is to
cause some physical writes to be performed by the kernel while we're
writing them out, rather than having the OS buffer those writes until we
finish the checkpoint.

The lazy checkpoint configuration should only affect automatic
checkpoints started by the postmaster because a checkpoint_timeout
occurred. Actually, it currently seems to apply to manually started
checkpoints as well. BufferSync() monitors the time to finish, held in
shared memory, so it would be relatively easy to hurry up a running lazy
checkpoint by setting that to zero. It's just that the postmaster can't
do that, because it does not have a PGPROC structure and therefore can't
lock that shmem structure. This is a must-fix item, because hurrying up
the checkpointer is critical at shutdown time.

_TODO_:

* Replace the global sync() in mdfsyncrecent(int max) with calls to
pg_fdatasync()

* Add functionality to postmaster to hurry up a running checkpoint
at shutdown.

* Make sure that manual checkpoints are not affected by the lazy
checkpoint config options and that they too hurry up a running one.

* Further improve vacuum's napping strategy depending on the IO actually
caused per page.

_NOTE_:

The core team is well aware of the high demand for these features. As
things stand however, it is impossible to get this functionality
released in version 7.4.

That does not mean that we have no chance to include some or all of the
functionality in a subsequent 7.4.x release. But for that to happen, the
TODOs mentioned above must get done first. Further, we need a
good amount of evidence that these changes actually achieve the desired
effect to a degree that justifies breaking our "no features in dot
releases" rule. We also need a good amount of evidence that the features
don't break anything or sacrifice stability, and that a backward
compatible behaviour (where possible ... not possible with ARC vs. LRU)
is the default.

I personally would like to see this work included in a 7.4.x release.
But it requires people to actually run tests, stress some hardware,
check platform portability and *give us feedback*, because this is what
we get for the release candidates, and these improvements can under no
circumstances be of any lower quality than that. If this goes into a
7.4.x release and there is any platform-dependent issue in it, it
endangers the timely fix of other bugs for those platforms, and that's a
no-go.

Happy testing

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Manfred Spraul <manfred(at)colorfullife(dot)com>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 19:53:21
Message-ID: 3FA95531.3000605@colorfullife.com
Lists: pgsql-hackers

Jan Wieck wrote:

>
> _Vacuum page delay_:
>
> Tom Lane's napping during vacuums with another tuning option. I
> replaced the usleep() call with a PG_DELAY(msec) macro in miscadmin.h,
> which does use select(2) instead. That should address the possible
> portability problems.

What about skipping the delay if there are no outstanding disk
operations? Then vacuum would get the full disk bandwidth if the system
is idle.

--
Manfred


From: Neil Conway <neilc(at)samurai(dot)com>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 20:08:53
Message-ID: 87k76ek27e.fsf@mailbox.samurai.com
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> This patch contains the "still not yet ready" performance improvements
> discussed over the couple last days.

Cool stuff!

> The buffer replacement strategy is a slightly modified version of
> ARC.

BTW Jan, I got your message about taking a look at the ARC code; I'm
really busy at the moment, but I'll definitely take a look at it when
I get a chance.

> I personally would like to see this work included in a 7.4.x
> release.

Personally, I can't see any circumstance under which I would view this
as appropriate for integration into the 7.4 branch -- the changes this
patch introduces are pretty fundamental to the system; even with
testing I'd rather not see a stable release series potentially
destabilized. Furthermore, it's not as if these performance issues
have been recently discovered: we've been aware of most of them for at
least one or two prior releases (if not much longer).

-Neil


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Manfred Spraul <manfred(at)colorfullife(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 20:11:18
Message-ID: 3FA95966.3040100@Yahoo.com
Lists: pgsql-hackers

Manfred Spraul wrote:

> Jan Wieck wrote:
>
>>
>> _Vacuum page delay_:
>>
>> Tom Lane's napping during vacuums with another tuning option. I
>> replaced the usleep() call with a PG_DELAY(msec) macro in miscadmin.h,
>> which does use select(2) instead. That should address the possible
>> portability problems.
>
> What about skipping the delay if there are no outstanding disk
> operations? Then vacuum would get the full disk bandwidth if the system
> is idle.

All we could do is to monitor our own recent activity. I doubt that
anything else would be portable. And on a dedicated DB server that is
very close to the truth anyway.

How portable is getrusage()? Could the postmaster issue that frequently
for RUSAGE_CHILDREN and leave the result somewhere in the shared memory
for whoever is concerned?

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 20:28:37
Message-ID: 3FA95D75.2030003@Yahoo.com
Lists: pgsql-hackers

Neil Conway wrote:

> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>> This patch contains the "still not yet ready" performance improvements
>> discussed over the couple last days.
>
> Cool stuff!
>
>> The buffer replacement strategy is a slightly modified version of
>> ARC.
>
> BTW Jan, I got your message about taking a look at the ARC code; I'm
> really busy at the moment, but I'll definitely take a look at it when
> I get a chance.
>
>> I personally would like to see this work included in a 7.4.x
>> release.
>
> Personally, I can't see any circumstance under which I would view this
> as appropriate for integration into the 7.4 branch -- the changes this
> patch introduces are pretty fundamental to the system; even with
> testing I'd rather not see a stable release series potentially
> destabilized. Furthermore, it's not as if these performance issues
> have been recently discovered: we've been aware of most of them for at
> least one or two prior releases (if not much longer).

There are many aspects to this, and a full consensus will probably not
be reachable.

As a matter of fact, people who have performance problems are likely to
be the same people who have upgrade problems. And as Gaetano pointed out
correctly, we will see wildforms with one or the other feature applied.

My opinion is that it is best for us as supporters and for the
reputation of PostgreSQL to try to keep the number of wildforms as small
as possible and to provide those features applied in the best possible
quality.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To:
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 20:37:52
Message-ID: 3FA95FA0.2050105@dunslane.net
Lists: pgsql-hackers

Jan Wieck wrote:

>
> How portable is getrusage()? Could the postmaster issue that
> frequently for RUSAGE_CHILDREN and leave the result somewhere in the
> shared memory for whoever is concerned?
>
SVr4, BSD4.3, SUS2 and POSIX1003.1, I believe.

I also believe there is a M$ dll available that gives that functionality
(psapi.dll).

cheers

andrew


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 20:49:54
Message-ID: 3FA96272.40804@Yahoo.com
Lists: pgsql-hackers

Andrew Dunstan wrote:

> Jan Wieck wrote:
>
>>
>> How portable is getrusage()? Could the postmaster issue that
>> frequently for RUSAGE_CHILDREN and leave the result somewhere in the
>> shared memory for whoever is concerned?
>>
> SVr4, BSD4.3, SUS2 and POSIX1003.1, I believe.
>
> I also believe there is a M$ dll available that gives that functionality
> (psapi.dll).

The question remains when it is updated; the manpage doesn't say. If
the RUSAGE_CHILDREN information is updated only when the child exits,
each backend has to do it.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Andrew Sullivan <andrew(at)libertyrms(dot)info>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 20:56:06
Message-ID: 20031105205605.GB5064@libertyrms.info
Lists: pgsql-hackers

On Wed, Nov 05, 2003 at 03:08:53PM -0500, Neil Conway wrote:
> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> > I personally would like to see this work included in a 7.4.x
> > release.
>
> Personally, I can't see any circumstance under which I would view this
> as appropriate for integration into the 7.4 branch -- the changes this

As unhappy as I am to say so, I agree strongly. Dot releases don't
get anything like enough testing to make me comfortable with putting
this kind of patch into such a release. I'm just a user though.

A

--
----
Andrew Sullivan 204-4141 Yonge Street
Afilias Canada Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M2P 2A8
+1 416 646 3304 x110


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Manfred Spraul <manfred(at)colorfullife(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 21:10:09
Message-ID: 10854.1068066609@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> Manfred Spraul wrote:
>> What about skipping the delay if there are no outstanding disk
>> operations?

> How portable is getrusage()? Could the postmaster issue that frequently
> for RUSAGE_CHILDREN and leave the result somewhere in the shared memory
> for whoever is concerned?

How would that tell you about currently outstanding operations?

Manfred's idea is interesting but AFAICS completely unimplementable
in any portable fashion. You'd have to have hooks into the kernel.

regards, tom lane


From: Kurt Roeckx <Q(at)ping(dot)be>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 21:16:11
Message-ID: 20031105211611.GB21364@ping.be
Lists: pgsql-hackers

On Wed, Nov 05, 2003 at 03:49:54PM -0500, Jan Wieck wrote:
> Andrew Dunstan wrote:
>
> >Jan Wieck wrote:
> >
> >>
> >>How portable is getrusage()? Could the postmaster issue that
> >>frequently for RUSAGE_CHILDREN and leave the result somewhere in the
> >>shared memory for whoever is concerned?
> >>
> >SVr4, BSD4.3, SUS2 and POSIX1003.1, I believe.
> >
> >I also believe there is a M$ dll available that gives that functionality
> >(psapi.dll).
>
> Remains the question when it is updated, the manpage doesn't tell. If
> the RUSAGE_CHILDREN information is updated only when the child exits,
> each backend has to do it.

"If the value of the who argument is RUSAGE_CHILDREN,
information shall be returned about resources used by the
terminated and waited-for children of the current process"
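
A small way to observe this on a Unix system, sketched in Python instead
of C. The helper names are made up for the example; the resource module
calls are the standard library wrappers around getrusage(). The middle
reading is taken while the child has not yet been waited for, so its CPU
time should only show up in the last reading.

```python
import os
import resource
import time


def children_cpu_seconds():
    """User plus system CPU time accounted to RUSAGE_CHILDREN so far."""
    ru = resource.getrusage(resource.RUSAGE_CHILDREN)
    return ru.ru_utime + ru.ru_stime


def measure(burn_seconds=0.05):
    """Return the children's CPU time before, during and after one child."""
    before = children_cpu_seconds()
    pid = os.fork()
    if pid == 0:
        # Child: burn some CPU so there is something to account for.
        try:
            while time.process_time() < burn_seconds:
                pass
        finally:
            os._exit(0)
    during = children_cpu_seconds()  # child not yet waited for
    os.waitpid(pid, 0)
    after = children_cpu_seconds()   # child terminated and waited for
    return before, during, after


if __name__ == "__main__":
    print(measure())
```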

Kurt


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Neil Conway <neilc(at)samurai(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 21:29:50
Message-ID: 13352.1068067790@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> As a matter of fact, people who have performance problems are likely to
> be the same who have upgrade problems. And as Gaetano pointed out
> correctly, we will see wildforms with one or the other feature applied.

I'd believe that for patches of the size of my original VACUUM-delay
hack (or even a production-grade version of same, which'd probably be
10x larger). The kind of wholesale rewrite you are currently proposing
is much too large to consider folding back into 7.4.*, IMHO.

regards, tom lane


From: Manfred Spraul <manfred(at)colorfullife(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 21:30:16
Message-ID: 3FA96BE8.3050200@colorfullife.com
Lists: pgsql-hackers

Tom Lane wrote:

>Manfred's idea is interesting but AFAICS completely unimplementable
>in any portable fashion. You'd have to have hooks into the kernel.
>
>
I was thinking of outstanding operations from postgres itself. I don't
know enough about the buffer layer to say whether it's possible to keep
a counter of the currently running read() and write() operations, or
something similar.
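
To make the idea concrete, here is a single-process Python sketch of such
a counter. Everything in it is hypothetical: the class, the io() wrapper
and maybe_nap() are illustration only, and in the backend the counter
would have to live in shared memory and be updated around the real
read() and write() calls.

```python
import threading
from contextlib import contextmanager


class IoCounter:
    """Count read()/write() style operations currently in flight."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = 0

    @contextmanager
    def io(self):
        """Wrap one I/O operation; the count drops again even on error."""
        with self._lock:
            self._in_flight += 1
        try:
            yield
        finally:
            with self._lock:
                self._in_flight -= 1

    def busy(self):
        with self._lock:
            return self._in_flight > 0


def maybe_nap(counter, nap, delay_ms):
    """Nap only if some other I/O is outstanding; return whether we napped."""
    if counter.busy():
        nap(delay_ms)
        return True
    return False
```

Vacuum would then call maybe_nap() where it currently naps
unconditionally, and skip the delay on an otherwise idle system.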

--
Manfred


From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <JanWieck(at)yahoo(dot)com>, Neil Conway <neilc(at)samurai(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-05 22:06:40
Message-ID: 3FA97470.3020803@zeut.net
Lists: pgsql-hackers

Tom Lane wrote:

>Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>
>
>>As a matter of fact, people who have performance problems are likely to
>>be the same who have upgrade problems. And as Gaetano pointed out
>>correctly, we will see wildforms with one or the other feature applied.
>>
>>
>
>I'd believe that for patches of the size of my original VACUUM-delay
>hack (or even a production-grade version of same, which'd probably be
>10x larger). The kind of wholesale rewrite you are currently proposing
>is much too large to consider folding back into 7.4.*, IMHO.
>
>
Do people think that the VACUUM-delay patch by itself would be useful
enough on its own to consider working it into 7.4.1 or something? From
the little feedback I have read on the VACUUM-delay patch used in
isolation, it certainly does help. I would love to see it put into 7.4
somehow.

The far more rigorous changes that Jan is working on will be welcome
improvements for 7.5.


From: "Stephen" <jleelim(at)xxxxxxx(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-05 23:15:10
Message-ID: 7vfqb.14072$5C1.10192@nntp-post.primus.ca
Lists: pgsql-hackers

Yes, I would like to see the vacuum delay patch go into 7.4.1 if possible.
It's really useful. I don't think there is any major risk in adding the
delay patch into a minor revision given the small amount of code change.

Stephen

""Matthew T. O'Connor"" <matthew(at)zeut(dot)net> wrote in message
news:3FA97470(dot)3020803(at)zeut(dot)net(dot)(dot)(dot)
> Tom Lane wrote:
>
> >Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> >
> >
> >>As a matter of fact, people who have performance problems are likely to
> >>be the same who have upgrade problems. And as Gaetano pointed out
> >>correctly, we will see wildforms with one or the other feature applied.
> >>
> >>
> >
> >I'd believe that for patches of the size of my original VACUUM-delay
> >hack (or even a production-grade version of same, which'd probably be
> >10x larger). The kind of wholesale rewrite you are currently proposing
> >is much too large to consider folding back into 7.4.*, IMHO.
> >
> >
> Do people think that the VACUUM-delay patch by itself, would be usefully
> enough on it's own to consider working it into 7.4.1 or something? From
> the little feedback I have read on the VACUUM-delay patch used in
> isolation, it certainly does help. I would love to see it put into 7.4
> somehow.
>
> The far more rigorous changes that Jan is working on, will be welcome
> improvements for 7.5.
>
>


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Wieck <JanWieck(at)yahoo(dot)com>, Neil Conway <neilc(at)samurai(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-07 13:09:22
Message-ID: 200311071309.hA7D9MC19596@candle.pha.pa.us
Lists: pgsql-hackers

Tom Lane wrote:
> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> > As a matter of fact, people who have performance problems are likely to
> > be the same who have upgrade problems. And as Gaetano pointed out
> > correctly, we will see wildforms with one or the other feature applied.
>
> I'd believe that for patches of the size of my original VACUUM-delay
> hack (or even a production-grade version of same, which'd probably be
> 10x larger). The kind of wholesale rewrite you are currently proposing
> is much too large to consider folding back into 7.4.*, IMHO.

What Jan could do is to have a 7.4 patch available that people can test,
and he can improve it during the 7.5 development cycle with feedback
from users.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-07 13:36:42
Message-ID: m3ptg42tcl.fsf@wolfe.cbbrowne.com
Lists: pgsql-hackers

A long time ago, in a galaxy far, far away, pgman(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian) wrote:
> Tom Lane wrote:
>> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>> > As a matter of fact, people who have performance problems are likely to
>> > be the same who have upgrade problems. And as Gaetano pointed out
>> > correctly, we will see wildforms with one or the other feature applied.
>>
>> I'd believe that for patches of the size of my original VACUUM-delay
>> hack (or even a production-grade version of same, which'd probably be
>> 10x larger). The kind of wholesale rewrite you are currently proposing
>> is much too large to consider folding back into 7.4.*, IMHO.
>
> What Jan could do is to have a 7.4 patch available that people can test,
> and he can improve it during the 7.5 development cycle with feedback
> from users.

The thing is, there are two patches that seem likely to be of
interest:

a) There's the ARC changes, which really feel like they are 7.5
development, not likely to be readily backportable;

b) On the other hand, a "simple delay" on the VACUUM seems likely
to be useful, and reasonably backportable.

And these are two quite different things, both of which may be worth
having.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','acm.org').
http://www.ntlug.org/~cbbrowne/unix.html
If I could put Klein in a bottle...


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Christopher Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-07 15:27:25
Message-ID: 200311071527.hA7FRPm03927@candle.pha.pa.us
Lists: pgsql-hackers

Christopher Browne wrote:
> A long time ago, in a galaxy far, far away, pgman(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian) wrote:
> > Tom Lane wrote:
> >> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> >> > As a matter of fact, people who have performance problems are likely to
> >> > be the same who have upgrade problems. And as Gaetano pointed out
> >> > correctly, we will see wildforms with one or the other feature applied.
> >>
> >> I'd believe that for patches of the size of my original VACUUM-delay
> >> hack (or even a production-grade version of same, which'd probably be
> >> 10x larger). The kind of wholesale rewrite you are currently proposing
> >> is much too large to consider folding back into 7.4.*, IMHO.
> >
> > What Jan could do is to have a 7.4 patch available that people can test,
> > and he can improve it during the 7.5 development cycle with feedback
> > from users.
>
> The thing is, there are two patches that seem likely to be of
> interest:
>
> a) There's the ARC changes, which really feel like they are 7.5
> development, not likely to be readily backportable;
>
> b) On the other hand, a "simple delay" on the VACUUM seems likely
> to be useful, and reasonably backportable.
>
> And these are two quite different things, both of which may be worth
> having.

Yes, Tom has already said "b" is possible in a 7.4.X subrelease, but not
for 7.4.0.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Christopher Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-07 15:56:02
Message-ID: 3FABC092.6060301@Yahoo.com
Lists: pgsql-hackers

Christopher Browne wrote:

> A long time ago, in a galaxy far, far away, pgman(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian) wrote:
>> Tom Lane wrote:
>>> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>>> > As a matter of fact, people who have performance problems are likely to
>>> > be the same who have upgrade problems. And as Gaetano pointed out
>>> > correctly, we will see wildforms with one or the other feature applied.
>>>
>>> I'd believe that for patches of the size of my original VACUUM-delay
>>> hack (or even a production-grade version of same, which'd probably be
>>> 10x larger). The kind of wholesale rewrite you are currently proposing
>>> is much too large to consider folding back into 7.4.*, IMHO.
>>
>> What Jan could do is to have a 7.4 patch available that people can test,
>> and he can improve it during the 7.5 development cycle with feedback
>> from users.
>
> The thing is, there are two patches that seem likely to be of
> interest:
>
> a) There's the ARC changes, which really feel like they are 7.5
> development, not likely to be readily backportable;
>
> b) On the other hand, a "simple delay" on the VACUUM seems likely
> to be useful, and reasonably backportable.
>
> And these are two quite different things, both of which may be worth
> having.

I only need to know the three W's, when, what and where (when do people
want what pieces of the stuff where?).

However, I have not seen much evidence yet that the vacuum delay alone
does that much. In conjunction with putting vacuum dirtied blocks at LRU
instead of MRU maybe, but that's again another functional change. So I
am not sure what the outcome of that for 7.4 is. The general opinion is
that the whole thing is too much. But nobody has done anything to show
how the vacuum delay alone compares to that.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-07 16:41:32
Message-ID: 29601.1068223292@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> However, I have not seen much evidence yet that the vacuum delay alone
> does that much.

Gaetano and a couple of other people did experiments that seemed to show
it was useful. I think we'd want to change the shape of the knob per
later suggestions (sleep 10 ms every N blocks, instead of N ms every
block) but it did seem that there was useful bang for little buck there.

regards, tom lane


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-07 19:33:00
Message-ID: 3FABF36C.4060109@Yahoo.com
Lists: pgsql-hackers

Tom Lane wrote:

> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>> However, I have not seen much evidence yet that the vacuum delay alone
>> does that much.
>
> Gaetano and a couple of other people did experiments that seemed to show
> it was useful. I think we'd want to change the shape of the knob per
> later suggestions (sleep 10 ms every N blocks, instead of N ms every
> block) but it did seem that there was useful bang for little buck there.

I thought it was "sleep N ms every M blocks".

Have we seen any numbers? Anything at all? Something that gives us a
clue by what factor one has to multiply the total time a "VACUUM
ANALYZE" takes, to get what effect in return?
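
For the pure sleep part, the arithmetic is simple enough to write down.
The sketch below only counts the requested naps; the table size is an
invented example, and real naps can be longer than requested because of
timer granularity, so treat the results as lower bounds rather than
measurements.

```python
def added_sleep_seconds(total_pages, pages_per_nap, nap_ms):
    """Total requested nap time for one pass over total_pages pages."""
    naps = total_pages // pages_per_nap
    return naps * nap_ms / 1000.0


# Hypothetical 100000-page table:
print(added_sleep_seconds(100000, 1, 10))    # 10 ms after every page
print(added_sleep_seconds(100000, 10, 10))   # 10 ms after every 10 pages
```

The first setting adds 1000 seconds of napping to the vacuum, the second
100 seconds. What that buys in foreground response time is exactly the
number that still has to be measured.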

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-07 20:25:58
Message-ID: 004e01c3a56d$5a8eeb80$5200a8c0@TERRIE
Lists: pgsql-hackers

----- Original Message -----
From: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
> Tom Lane wrote:
> > Gaetano and a couple of other people did experiments that seemed to show
> > it was useful. I think we'd want to change the shape of the knob per
> > later suggestions (sleep 10 ms every N blocks, instead of N ms every
> > block) but it did seem that there was useful bang for little buck there.
>
> I thought it was "sleep N ms every M blocks".
>
> Have we seen any numbers? Anything at all? Something that gives us a
> clue by what factor one has to multiply the total time a "VACUUM
> ANALYZE" takes, to get what effect in return?

I have some time on Sunday to do some testing. Is there a patch that I can
apply that implements either of the two options (sleep 10 ms every M blocks
or sleep N ms every M blocks)?

I know Tom posted the original patch that slept N ms every 1 block (where N
is > 10 due to OS limitations). Jan, can you post a patch that has just the
sleep code in it? Or should it be easy enough for me to cull out of the
larger patch you posted?


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Browne <cbbrowne(at)acm(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-07 22:25:30
Message-ID: Pine.LNX.4.33.0311071523120.14553-100000@css120.ihs.com
Lists: pgsql-hackers

On Fri, 7 Nov 2003, Matthew T. O'Connor wrote:

> ----- Original Message -----
> From: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
> > Tom Lane wrote:
> > > Gaetano and a couple of other people did experiments that seemed to show
> > > it was useful. I think we'd want to change the shape of the knob per
> > > later suggestions (sleep 10 ms every N blocks, instead of N ms every
> > > block) but it did seem that there was useful bang for little buck there.
> >
> > I thought it was "sleep N ms every M blocks".
> >
> > Have we seen any numbers? Anything at all? Something that gives us a
> > clue by what factor one has to multiply the total time a "VACUUM
> > ANALYZE" takes, to get what effect in return?
>
> I have some time on Sunday to do some testing. Is there a patch that I can
> apply that implements either of the two options (sleep 10 ms every M blocks
> or sleep N ms every M blocks)?
>
> I know Tom posted the original patch that slept N ms every 1 block (where N
> is > 10 due to OS limitations). Jan, can you post a patch that has just the
> sleep code in it? Or should it be easy enough for me to cull out of the
> larger patch you posted?

The reason for the change is that the minimum sleep period on many systems
is 10 ms, which meant that vacuum was running 20X slower than normal.
While it might be necessary in certain very I/O starved situations to make
it this slow, it would probably be better to be able to get a vacuum that
ran at about 1/2 to 1/5 speed for most folks. So, since the delay can't
be less than 10 ms on most systems, it's better to just leave it at a fixed
amount and change the number of pages vacuumed per sleep.
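
The arithmetic behind those figures can be sketched as follows. This
assumes roughly 0.5 ms of work per page, a number back-derived from the
20X figure above rather than measured; slowdown_factor and PAGE_MS are
illustrative names, not part of any patch.

```python
def slowdown_factor(page_ms: float, sleep_ms: float, pages_per_sleep: int) -> float:
    """Ratio of throttled to unthrottled time for one group of pages."""
    work_ms = page_ms * pages_per_sleep
    return (work_ms + sleep_ms) / work_ms


PAGE_MS = 0.5  # assumed work per page in ms, illustrative only

# Fixed 10 ms sleep, varying the number of pages vacuumed per sleep.
for pages in (1, 5, 10, 20, 80):
    factor = slowdown_factor(PAGE_MS, 10.0, pages)
    print(f"{pages:3d} pages per sleep -> {factor:.2f}x")
```

Under that assumption, one page per 10 ms sleep comes out at about 21X,
5 pages per sleep at 5X, and 20 pages per sleep at 2X, which is the
1/5 to 1/2 speed range mentioned above.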

I'm certainly gonna test the patch out too. We aren't really I/O bound,
but it would be nice to have a database that only slowed down ~1% or so
during vacuuming.


From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Performance features the 4th
Date: 2003-11-09 22:51:21
Message-ID: 3FAEC4E9.9010707@bigfoot.com
Lists: pgsql-hackers

Tom Lane wrote:

> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>
>>However, I have not seen much evidence yet that the vacuum delay alone
>>does that much.
>
>
> Gaetano and a couple of other people did experiments that seemed to show
> it was useful. I think we'd want to change the shape of the knob per
> later suggestions (sleep 10 ms every N blocks, instead of N ms every
> block) but it did seem that there was useful bang for little buck there.

Right, I'd like to try the patch now: "sleep N ms every M blocks".
Can you please post this patch?

BTW, I'll see if I'm able to apply it to a 7.3.X as well (our production
DB).

Regards
Gaetano Mendola


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-09 23:09:52
Message-ID: 3FAEC940.30408@Yahoo.com
Lists: pgsql-hackers

scott.marlowe wrote:

> On Fri, 7 Nov 2003, Matthew T. O'Connor wrote:
>
>> ----- Original Message -----
>> From: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
>> > Tom Lane wrote:
>> > > Gaetano and a couple of other people did experiments that seemed to show
>> > > it was useful. I think we'd want to change the shape of the knob per
>> > > later suggestions (sleep 10 ms every N blocks, instead of N ms every
>> > > block) but it did seem that there was useful bang for little buck there.
>> >
>> > I thought it was "sleep N ms every M blocks".
>> >
>> > Have we seen any numbers? Anything at all? Something that gives us a
>> > clue by what factor one has to multiply the total time a "VACUUM
>> > ANALYZE" takes, to get what effect in return?
>>
>> I have some time on Sunday to do some testing. Is there a patch that I can
>> apply that implements either of the two options (sleep 10 ms every M blocks
>> or sleep N ms every M blocks)?
>>
>> I know Tom posted the original patch that slept N ms every 1 block (where N
>> is > 10 due to OS limitations). Jan, can you post a patch that has just the
>> sleep code in it? Or should it be easy enough for me to cull out of the
>> larger patch you posted?
>
> The reason for the change is that the minimum sleep period on many systems
> is 10 ms, which meant that vacuum was running 20X slower than normal.
> While it might be necessary in certain very I/O starved situations to make
> it this slow, it would probably be better to be able to get a vacuum that
> ran at about 1/2 to 1/5 speed for most folks. So, since the delay can't
> be less than 10 ms on most systems, it's better to just leave it at a fixed
> amount and change the number of pages vacuumed per sleep.

I disagree with that. If the number of pages is the only knob you have
and the napping time is fixed, the only way to slow vacuum down is to
lower the number of sequentially read pages. That makes read ahead
absurd in an IO starved situation ...
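
A rough way to compare the two designs, assuming about 0.5 ms of work per
page (an illustrative figure, not a measurement) and hypothetical helper
names:

```python
def group_for_fixed_nap(slowdown: float, nap_ms: float, page_ms: float) -> float:
    """Pages per nap needed to reach a target slowdown when the nap is fixed."""
    # From: slowdown = (group * page_ms + nap_ms) / (group * page_ms)
    return nap_ms / ((slowdown - 1.0) * page_ms)


def nap_for_fixed_group(slowdown: float, group: int, page_ms: float) -> float:
    """Nap length needed to reach a target slowdown when the group size is fixed."""
    return (slowdown - 1.0) * group * page_ms


PAGE_MS = 0.5  # assumed work per page in ms, illustrative only

for target in (2.0, 5.0, 21.0):
    pages = group_for_fixed_nap(target, 10.0, PAGE_MS)
    nap = nap_for_fixed_group(target, 20, PAGE_MS)
    print(f"{target:g}x: fixed 10 ms nap -> {pages:g} pages per run; "
          f"fixed 20-page run -> {nap:g} ms nap")
```

With the nap fixed at 10 ms, stronger throttling shrinks the sequential
run down to a single page. With both knobs available, the run can stay at
20 pages and only the nap grows.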

I'll post a patch doing

every N pages nap for M milliseconds

using two GUC variables and based on a select(2) call later.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance features the 4th
Date: 2003-11-09 23:42:53
Message-ID: 3FAED0FD.2020803@Yahoo.com
Lists: pgsql-hackers

Matthew T. O'Connor wrote:

> ----- Original Message -----
> From: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
>> Tom Lane wrote:
>> > Gaetano and a couple of other people did experiments that seemed to show
>> > it was useful. I think we'd want to change the shape of the knob per
>> > later suggestions (sleep 10 ms every N blocks, instead of N ms every
>> > block) but it did seem that there was useful bang for little buck there.
>>
>> I thought it was "sleep N ms every M blocks".
>>
>> Have we seen any numbers? Anything at all? Something that gives us a
>> clue by what factor one has to multiply the total time a "VACUUM
>> ANALYZE" takes, to get what effect in return?
>
> I have some time on Sunday to do some testing. Is there a patch that I can
> apply that implements either of the two options (sleep 10 ms every M blocks
> or sleep N ms every M blocks)?
>
> I know Tom posted the original patch that slept N ms every 1 block (where N
> is > 10 due to OS limitations). Jan, can you post a patch that has just the
> sleep code in it? Or should it be easy enough for me to cull out of the
> larger patch you posted?

Sorry for the delay, had to finish some other concept yesterday (will be
published soon).

The attached patch adds

vacuum_group_delay_size = 10 (range 1-1000)
vacuum_group_delay_msec = 0 (range 0-1000)

and does the sleeping via select(2). It does it only at the same places
where Tom had done the usleep() in his hack, so I guess there is still
some more to do besides the documentation, before it can be added to
7.4.1. But it should be enough to get some testing done.
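
For readers without the attachment at hand, the behaviour described above
might look roughly like this. It is a Python sketch of the idea only, not
the patch's C code: nap_msec and scan_pages are hypothetical names, the
two parameters mirror vacuum_group_delay_size and
vacuum_group_delay_msec, and using select() with no descriptors as a
sleep is assumed to work (it does on Unix-like systems, not on Windows).

```python
import select


def nap_msec(msec: int) -> None:
    """Sleep for msec milliseconds via select() with no file descriptors."""
    if msec > 0:
        select.select([], [], [], msec / 1000.0)


def scan_pages(num_pages, group_size, delay_msec, process_page):
    """Process pages in order, napping after every full group of pages.

    Returns the number of naps taken. A delay of 0 disables napping.
    """
    naps = 0
    for page in range(num_pages):
        process_page(page)
        if delay_msec > 0 and (page + 1) % group_size == 0:
            nap_msec(delay_msec)
            naps += 1
    return naps


if __name__ == "__main__":
    # 25 pages in groups of 10 with a 10 ms nap: naps after pages 10 and 20.
    print(scan_pages(25, 10, 10, lambda page: None))
```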

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

Attachment Content-Type Size
vacuum_group_delay.74.diff text/plain 6.4 KB

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Jan Wieck <JanWieck(at)yahoo(dot)com>
Cc: "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Browne <cbbrowne(at)acm(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance features the 4th
Date: 2003-11-10 16:02:51
Message-ID: Pine.LNX.4.33.0311100900260.27327-100000@css120.ihs.com
Lists: pgsql-hackers

On Sun, 9 Nov 2003, Jan Wieck wrote:

> scott.marlowe wrote:
>
> > On Fri, 7 Nov 2003, Matthew T. O'Connor wrote:
> >
> >> ----- Original Message -----
> >> From: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
> >> > Tom Lane wrote:
> >> > > Gaetano and a couple of other people did experiments that seemed to show
> >> > > it was useful. I think we'd want to change the shape of the knob per
> >> > > later suggestions (sleep 10 ms every N blocks, instead of N ms every
> >> > > block) but it did seem that there was useful bang for little buck there.
> >> >
> >> > I thought it was "sleep N ms every M blocks".
> >> >
> >> > Have we seen any numbers? Anything at all? Something that gives us a
> >> > clue by what factor one has to multiply the total time a "VACUUM
> >> > ANALYZE" takes, to get what effect in return?
> >>
> >> I have some time on Sunday to do some testing. Is there a patch that I can
> >> apply that implements either of the two options (sleep 10 ms every M blocks
> >> or sleep N ms every M blocks)?
> >>
> >> I know Tom posted the original patch that slept N ms every 1 block (where N
> >> is > 10 due to OS limitations). Jan, can you post a patch that has just the
> >> sleep code in it? Or should it be easy enough for me to cull out of the
> >> larger patch you posted?
> >
> > The reason for the change is that the minimum sleep period on many systems
> > is 10 ms, which meant that vacuum was running 20X slower than normal.
> > While it might be necessary in certain very I/O starved situations to make
> > it this slow, it would probably be better to be able to get a vacuum that
> > ran at about 1/2 to 1/5 speed for most folks. So, since the delay can't
> > be less than 10 ms on most systems, it's better to just leave it at a fixed
> > amount and change the number of pages vacuumed per sleep.
>
> I disagree with that. If the number of pages is the only knob you have
> and the napping time is fixed, the only way to slow vacuum down is to
> lower the number of sequentially read pages. That makes read ahead
> absurd in an IO starved situation ...
>
> I'll post a patch doing
>
> every N pages nap for M milliseconds
>
> using two GUC variables and based on a select(2) call later.

I didn't mean "fixed in the code", I meant fixed in your setup. I.e. find a
delay (10 ms, 50, 100, etc.), then vary the number of pages processed at a
time until you start to notice the load, then back it off.

The code doesn't force you to have one and only one delay value; you set
it yourself.
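
That procedure could be sketched like this, in Python. load_ok is a
hypothetical callback standing in for whatever you use to judge whether
the load is still acceptable at a given number of pages per nap, the
doubling step is just one convenient way to search, and the limit of 1000
is taken from the range in Jan's patch.

```python
from typing import Callable, Optional


def tune_group_size(load_ok: Callable[[int], bool],
                    start: int = 1,
                    limit: int = 1000) -> Optional[int]:
    """Grow the pages-per-nap value until the load becomes noticeable.

    Returns the last size that was still acceptable, or None if even the
    starting size was too much for the chosen delay.
    """
    best = None
    size = start
    while size <= limit and load_ok(size):
        best = size   # still fine, remember it
        size *= 2     # try a larger group next
    return best


if __name__ == "__main__":
    # Pretend the load becomes noticeable above 40 pages per nap.
    print(tune_group_size(lambda pages: pages <= 40))
```

If None comes back, the chosen delay is too short for that system and the
same search can be repeated with a longer one.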