Re: drop duplicate buffers in OS

Lists: pgsql-hackers
From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: drop duplicate buffers in OS
Date: 2014-01-15 06:53:07
Message-ID: 52D63053.7000003@lab.ntt.co.jp

Hi,

I have created a patch that can drop duplicate buffers in the OS using a
usage_count algorithm. I have been developing this patch since last summer.
This feature seems to be a hot discussion topic, so I am submitting it earlier
than I had planned.

When a buffer's usage_count in shared_buffers is high, it is hard to drop from
shared_buffers. However, such buffers are not needed in the OS file cache,
because they are not accessed there (postgres accesses shared_buffers). So I
created an algorithm that drops the file-cache copies of pages that have a high
usage_count in shared_buffers and are in a clean state in the OS. If a page is
clean in the OS file cache, executing posix_fadvise(POSIX_FADV_DONTNEED) frees
the cached copy without writing to physical disk. This algorithm should solve
the double-buffering problem and use memory more efficiently.

I am running the DBT-2 benchmark now...

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size
drop_duplicate_buffers_v4.patch text/x-diff 13.5 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-01-15 18:34:32
Message-ID: CA+TgmoaqcS35rvmudPvk3xSTBH=NZwhE0884=8q34YuDWi_k5w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa
<kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I have created a patch that can drop duplicate buffers in the OS using a
> usage_count algorithm. I have been developing this patch since last summer.
> This feature seems to be a hot discussion topic, so I am submitting it
> earlier than I had planned.
>
> When a buffer's usage_count in shared_buffers is high, it is hard to drop
> from shared_buffers. However, such buffers are not needed in the OS file
> cache, because they are not accessed there (postgres accesses
> shared_buffers). So I created an algorithm that drops the file-cache copies
> of pages that have a high usage_count in shared_buffers and are in a clean
> state in the OS. If a page is clean in the OS file cache, executing
> posix_fadvise(POSIX_FADV_DONTNEED) frees the cached copy without writing to
> physical disk. This algorithm should solve the double-buffering problem and
> use memory more efficiently.
>
> I am running the DBT-2 benchmark now...

The thing about this is that our usage counts for shared_buffers don't
really work right now; it's common for everything, or nearly
everything, to have a usage count of 5. So I'm reluctant to rely on
that for much of anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-01-16 12:38:06
Message-ID: CAC_2qU9x_nh1tj+guS-Qojjh=mD5zPm=uFczk-k36=d1LJ3RzQ@mail.gmail.com
Lists: pgsql-hackers

Can we just get the backend that dirties the page to issue the
posix_fadvise(DONTNEED)?

Or have another helper that sweeps the shared buffers and does this
post-first-dirty?

a.

On Wed, Jan 15, 2014 at 1:34 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> The thing about this is that our usage counts for shared_buffers don't
> really work right now; it's common for everything, or nearly
> everything, to have a usage count of 5. So I'm reluctant to rely on
> that for much of anything.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Aidan Van Dyk                       aidan(at)highrise(dot)ca
http://www.highrise.ca/
"Create like a god, command like a king, work like a slave."


From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-01-17 07:33:11
Message-ID: 52D8DCB7.90503@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/01/16 21:38), Aidan Van Dyk wrote:
> Can we just get the backend that dirties the page to issue the posix_fadvise(DONTNEED)?
No; we can only drop pages that are clean in the OS file cache, because
dropping a dirty page would force a physical disk write. However, this is an
experimental patch, so the design might change based on future benchmark
testing.

> Or have another helper that sweeps the shared buffers and does this post-first-dirty?
We could add a DropDuplicateOSCache() function to the checkpointer process or
some other process. We could also change the posix_fadvise(DONTNEED) call to
sync_file_range(), which writes the target buffer out to physical disk rather
than freeing the OS file cache.

I am also considering using sync_file_range() with SYNC_FILE_RANGE_WAIT_BEFORE |
SYNC_FILE_RANGE_WRITE while executing a checkpoint. It could avoid the fsync
freeze situation in the final part of a checkpoint.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-01-17 07:35:01
Message-ID: 52D8DD25.6090504@lab.ntt.co.jp
Lists: pgsql-hackers

(2014/01/16 3:34), Robert Haas wrote:
> The thing about this is that our usage counts for shared_buffers don't
> really work right now; it's common for everything, or nearly
> everything, to have a usage count of 5. So I'm reluctant to rely on
> that for much of anything.
This patch is aimed at configurations with large shared_buffers, so a setup
with shared_buffers at only 10% of memory might not benefit. The patch is
experimental and is meant to show one example of how to solve the
double-buffering problem.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-01-28 23:20:04
Message-ID: CAMkU=1xoX7Dxu6V433gyrfgdyLEjVLev7L8xJ7w3MkJUay4FOw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 10:34 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> > I am running the DBT-2 benchmark now...
>

Have you had any luck with it? I have reservations about this approach.
Among other reasons, if the buffer is truly nailed in shared_buffers for
the long term, the kernel won't see any activity on it and will be able to
evict it fairly efficiently on its own.

So I'm reluctant to do a detailed review if the author cannot demonstrate a
performance improvement. I'm going to mark it waiting-on-author for that
reason.

>
> The thing about this is that our usage counts for shared_buffers don't
> really work right now; it's common for everything, or nearly
> everything, to have a usage count of 5.

I'm surprised that that is common. The only cases I've seen that were
either when the database exactly fits in shared_buffers, or when the
database is mostly appended to, and the appends are done with inserts in a
loop rather than COPY.

Cheers,

Jeff


From: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-01-29 07:53:28
Message-ID: 52E8B378.1030707@lab.ntt.co.jp
Lists: pgsql-hackers

Hi,

Attached is the latest patch.
I changed PinBuffer() in bufmgr.c slightly so that it evicts the target clean
file cache in the OS more precisely:

- if (!(buf->flags & BM_FADVED) && !(buf->flags & BM_JUST_DIRTIED))
+ if (!(buf->flags & BM_DIRTY) && !(buf->flags & BM_FADVED) && !(buf->flags & BM_JUST_DIRTIED))

(2014/01/29 8:20), Jeff Janes wrote:
> Have you had any luck with it? I have reservations about this approach. Among
> other reasons, if the buffer is truly nailed in shared_buffers for the long term,
> the kernel won't see any activity on it and will be able to evict it fairly
> efficiently on its own.
My patch aims to avoid evicting other useful file cache in the OS. If we do
not drop the duplicated caches of pages held in shared_buffers, the kernel
will evict them together with other useful file cache. But if we drop the
duplicates as early as possible, the other useful file cache becomes much
less likely to be evicted.

> So I'm reluctant to do a detailed review if the author cannot demonstrate a
> performance improvement. I'm going to mark it waiting-on-author for that reason.
Will you review my patch? Thank you so much! However, my patch's performance
is only a little better; the difference might be within the margin of error.
The kernel readahead optimization patch is great: too much readahead in the OS
is harmful and fills the OS with useless file cache.
Here is the test result. The plain result was measured earlier (during the
readahead patch test).

* Test server
Server: HP Proliant DL360 G7
CPU: Xeon E5640 2.66GHz (1P/4C)
Memory: 18GB(PC3-10600R-9)
Disk: 146GB(15k)*4 RAID1+0
RAID controller: P410i/256MB
OS: RHEL 6.4(x86_64)
FS: Ext4

* DBT-2 result(WH400, SESSION=100, ideal_score=5160)
Method | score | average | 90%tile | Maximum
------------------------------------------------
plain | 3589 | 9.751 | 33.680 | 87.8036
patched | 3799 | 9.914 | 22.451 | 119.4259

* Main Settings
shared_buffers = 2458MB
drop_duplicate_buffers = 5 // patched only

I also ran the benchmark with drop_duplicate_buffers = 3 and 4, but did not get
good results, so I will test those settings again with larger shared_buffers.

[detail settings]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/dbserver/param.out

* Detail results (uploading now; please allow about an hour...)
[plain]
http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/index_thput.html

[patched]
http://pgstatsinfo.projects.pgfoundry.org/drop_os_cache/drop_dupulicate_cache20140129/HTML/index_thput.html

We can see faster response times during OS writeback (probably) and while
executing CHECKPOINT. When those occur, read transactions hit the file cache
more often with my patch, so response times are better than plain.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

Attachment Content-Type Size
remove_duplicate_buffers_v05.patch text/x-diff 13.6 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: KONDO Mitsumasa <kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop duplicate buffers in OS
Date: 2014-03-04 21:47:03
Message-ID: CA+TgmoZ9VmvDd3YGAPyWtCicB0Q7GzRvuJpOf_+WdmX-8PrY1Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 29, 2014 at 2:53 AM, KONDO Mitsumasa
<kondo(dot)mitsumasa(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Attached is the latest patch.
> I changed PinBuffer() in bufmgr.c slightly so that it evicts the target clean
> file cache in the OS more precisely:
>
> - if (!(buf->flags & BM_FADVED) && !(buf->flags & BM_JUST_DIRTIED))
> + if (!(buf->flags & BM_DIRTY) && !(buf->flags & BM_FADVED) && !(buf->flags & BM_JUST_DIRTIED))
>
>
> (2014/01/29 8:20), Jeff Janes wrote:
>>
>> Have you had any luck with it? I have reservations about this approach.
>> Among other reasons, if the buffer is truly nailed in shared_buffers for the
>> long term, the kernel won't see any activity on it and will be able to evict
>> it fairly efficiently on its own.
>
> My patch aims to avoid evicting other useful file cache in the OS. If we do
> not drop the duplicated caches of pages held in shared_buffers, the kernel
> will evict them together with other useful file cache. But if we drop the
> duplicates as early as possible, the other useful file cache becomes much
> less likely to be evicted.
>
>
>> So I'm reluctant to do a detailed review if the author cannot demonstrate a
>> performance improvement. I'm going to mark it waiting-on-author for that
>> reason.
>
> Will you review my patch? Thank you so much! However, my patch's performance
> is only a little better; the difference might be within the margin of error.
> The kernel readahead optimization patch is great: too much readahead in the
> OS is harmful and fills the OS with useless file cache.
> Here is the test result. The plain result was measured earlier (during the
> readahead patch test).
>
> * Test server
> Server: HP Proliant DL360 G7
> CPU: Xeon E5640 2.66GHz (1P/4C)
> Memory: 18GB(PC3-10600R-9)
> Disk: 146GB(15k)*4 RAID1+0
> RAID controller: P410i/256MB
> OS: RHEL 6.4(x86_64)
> FS: Ext4
>
> * DBT-2 result(WH400, SESSION=100, ideal_score=5160)
> Method | score | average | 90%tile | Maximum
> ------------------------------------------------
> plain | 3589 | 9.751 | 33.680 | 87.8036
> patched | 3799 | 9.914 | 22.451 | 119.4259
>
> * Main Settings
> shared_buffers = 2458MB
> drop_duplicate_buffers = 5 // patched only
>
> I also ran the benchmark with drop_duplicate_buffers = 3 and 4, but did not
> get good results, so I will test those settings again with larger
> shared_buffers.
>
> [detail settings]
> http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/dbserver/param.out
>
> * Detail results
> [plain]
> http://pgstatsinfo.projects.pgfoundry.org/readahead_dbt2/normal_20140109/HTML/index_thput.html
> [patched]
> http://pgstatsinfo.projects.pgfoundry.org/drop_os_cache/drop_dupulicate_cache20140129/HTML/index_thput.html
>
> We can see faster response times during OS writeback (probably) and while
> executing CHECKPOINT. When those occur, read transactions hit the file cache
> more often with my patch, so response times are better than plain.

I think it's pretty clear that these results are not good enough to
justify committing this patch. To do something like this, we need to
have a lot of confidence that this will be a win not just on one
particular system or workload, but rather that it's got to be a
general win across many systems and workloads. I'm not convinced
that's true, and if it is true the test results submitted thus far are
nowhere near sufficient to establish it, and I can't see that changing
in the next few weeks. So I think it's pretty clear that we should
mark this Returned with Feedback for now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company