Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)

Lists: pgsql-hackers
From: Richard Poole <richard(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-13 23:41:25
Message-ID: 20130913234125.GC13697@roobarb.crazydogs.org

The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
on systems that support it. It's based on Christian Kruse's patch from
last year, incorporating suggestions from Andres Freund.

On a system with 4GB shared_buffers, doing pgbench runs long enough for
each backend to touch most of the buffers, this patch saves nearly 8MB of
memory per backend and improves performance by just over 2% on average.
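
As a back-of-the-envelope check on that figure, here is a sketch of the
arithmetic. It assumes x86-64 with 8-byte page-table entries and counts
only the last-level entries; leaf_pte_bytes() is a name made up for this
illustration and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of last-level page-table entries needed to map `mapped` bytes.
 * Assumption: one fixed-size entry per page; upper-level tables are ignored.
 */
uint64_t
leaf_pte_bytes(uint64_t mapped, uint64_t page_size, uint64_t entry_size)
{
    assert(page_size > 0);
    /* one entry per page, counting a partial trailing page as a full one */
    return ((mapped + page_size - 1) / page_size) * entry_size;
}
```

With 4kB pages, 4GB of mappings needs about 8MB of last-level entries per
process; with 2MB huge pages the same range needs about 16kB, which is
roughly consistent with the saving quoted above.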

It is still WIP as there are a couple of points that Andres has pointed
out to me that haven't been addressed yet; also, the documentation is
incomplete.

Richard

--
Richard Poole http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
hugepages-v1.patch text/x-diff 11.0 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Richard Poole <richard(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-15 02:03:50
Message-ID: 1379210630.19286.27.camel@vanquo.pezone.net

On Sat, 2013-09-14 at 00:41 +0100, Richard Poole wrote:
> The attached patch adds the MAP_HUGETLB flag to mmap() for shared
> memory on systems that support it.

Please fix the tabs in the SGML files.


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Richard Poole <richard(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-16 08:15:28
Message-ID: 5236BE20.6040904@vmware.com

On 14.09.2013 02:41, Richard Poole wrote:
> The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
> on systems that support it. It's based on Christian Kruse's patch from
> last year, incorporating suggestions from Andres Freund.

I don't understand the logic in figuring out the pagesize, and the
smallest supported hugepage size. First of all, even without the patch,
why do we round up the size passed to mmap() to the _SC_PAGE_SIZE?
Surely the kernel will round up the request all by itself. The mmap()
man page doesn't say anything about length having to be a multiple of
the page size.

And with the patch, why do you bother detecting the minimum supported
hugepage size? Surely the kernel will choose the appropriate hugepage
size just fine on its own, no?

> It is still WIP as there are a couple of points that Andres has pointed
> out to me that haven't been addressed yet;

Which points are those?

I wonder if it would be better to allow setting huge_tlb_pages=try even
on platforms that don't have hugepages. It would simply mean the same as
'off' on such platforms.

- Heikki


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Richard Poole <richard(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-16 10:15:38
Message-ID: 20130916101538.GK1330627@alap2.anarazel.de

On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:
> On 14.09.2013 02:41, Richard Poole wrote:
> >The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
> >on systems that support it. It's based on Christian Kruse's patch from
> >last year, incorporating suggestions from Andres Freund.
>
> I don't understand the logic in figuring out the pagesize, and the smallest
> supported hugepage size. First of all, even without the patch, why do we
> round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
> will round up the request all by itself. The mmap() man page doesn't say
> anything about length having to be a multiple of the page size.

I think it does:
    EINVAL We don't like addr, length, or offset (e.g., they are too
           large, or not aligned on a page boundary).
and
    A file is mapped in multiples of the page size. For a file that is
    not a multiple of the page size, the remaining memory is zeroed when
    mapped, and writes to that region are not written out to the file.
    The effect of changing the size of the underlying file of a mapping
    on the pages that correspond to added or removed regions of the file
    is unspecified.

And no, according to my past experience, the kernel does *not* do any
such rounding up. It will just fail.

> And with the patch, why do you bother detecting the minimum supported
> hugepage size? Surely the kernel will choose the appropriate hugepage size
> just fine on its own, no?

It will fail if it's not a multiple.
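
For anyone following along, the sort of scan being discussed could look
roughly like the sketch below. It assumes the supported sizes show up as
directory names of the form hugepages-<N>kB under
/sys/kernel/mm/hugepages; the function names are invented here, and this
is not the code from the patch.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a directory name such as "hugepages-2048kB" and return the size in
 * bytes, or 0 if the name does not have that form.
 */
unsigned long
hugepage_size_from_name(const char *name)
{
    static const char prefix[] = "hugepages-";
    const char *digits;
    char       *end;
    unsigned long kb;

    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return 0;
    digits = name + sizeof(prefix) - 1;
    if (!isdigit((unsigned char) *digits))
        return 0;
    kb = strtoul(digits, &end, 10);
    if (strcmp(end, "kB") != 0)
        return 0;
    return kb * 1024UL;
}

/*
 * Return the smallest huge page size listed under `dir`, or 0 if the
 * directory cannot be read or contains no matching entries.
 */
unsigned long
smallest_hugepage_size(const char *dir)
{
    DIR        *d = opendir(dir);
    struct dirent *de;
    unsigned long best = 0;

    if (d == NULL)
        return 0;
    while ((de = readdir(d)) != NULL)
    {
        unsigned long sz = hugepage_size_from_name(de->d_name);

        if (sz != 0 && (best == 0 || sz < best))
            best = sz;
    }
    closedir(d);
    return best;
}
```

Calling smallest_hugepage_size("/sys/kernel/mm/hugepages") would then give
the unit to round the request up to, or 0 when nothing is found.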

> >It is still WIP as there are a couple of points that Andres has pointed
> >out to me that haven't been addressed yet;
>
> Which points are those?

I don't know which points Richard has already fixed, so I'll let him
comment on that.

> I wonder if it would be better to allow setting huge_tlb_pages=try even on
> platforms that don't have hugepages. It would simply mean the same as 'off'
> on such platforms.

I wouldn't argue against that.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Richard Poole <richard(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-16 13:13:57
Message-ID: 52370415.6060108@vmware.com

On 16.09.2013 13:15, Andres Freund wrote:
> On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:
>> On 14.09.2013 02:41, Richard Poole wrote:
>>> The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
>>> on systems that support it. It's based on Christian Kruse's patch from
>>> last year, incorporating suggestions from Andres Freund.
>>
>> I don't understand the logic in figuring out the pagesize, and the smallest
>> supported hugepage size. First of all, even without the patch, why do we
>> round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
>> will round up the request all by itself. The mmap() man page doesn't say
>> anything about length having to be a multiple of the page size.
>
> I think it does:
> EINVAL We don't like addr, length, or offset (e.g., they are too
> large, or not aligned on a page boundary).

That doesn't mean that they *all* have to be aligned on a page boundary.
It's understandable that 'addr' and 'offset' have to be, but it doesn't
make much sense for 'length'.

> and
> A file is mapped in multiples of the page size. For a file that is not a multiple
> of the page size, the remaining memory is zeroed when mapped, and writes to that
> region are not written out to the file. The effect of changing the size of the
> underlying file of a mapping on the pages that correspond to added or removed
> regions of the file is unspecified.
>
> And no, according to my past experience, the kernel does *not* do any
> such rounding up. It will just fail.

I wrote a little test program to play with different values (attached).
I tried this on my laptop with a 3.2 kernel (uname -r: 3.10-2-amd6), and
on a VM with a fresh CentOS 6.4 install with a 2.6.32 kernel
(2.6.32-358.18.1.el6.x86_64), and they both work the same:

$ ./mmaptest 100 # mmap 100 bytes

in a different terminal:
$ cat /proc/meminfo | grep HugePages_Rsvd
HugePages_Rsvd: 1

So even a tiny allocation, much smaller than any page size, succeeds,
and it reserves a huge page. I tried the same with larger values; the
kernel always uses huge pages, and rounds up the allocation to a
multiple of the huge page size.
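
The attachment isn't reproduced inline. A test along these lines could be
built around something like the following, which is a hypothetical sketch
and not the actual mmaptest.c; the function name is invented.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS / MAP_HUGETLB on glibc */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Try to map `len` bytes of anonymous shared memory backed by huge pages.
 * `len` is deliberately NOT rounded up, to see what the kernel does.
 * Returns the mapping, or NULL with *err set to the errno of the failure.
 * Where the flags are not available at compile time, fails with ENOSYS.
 */
void *
map_with_hugetlb(size_t len, int *err)
{
#if defined(MAP_HUGETLB) && defined(MAP_ANONYMOUS)
    void       *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    if (p == MAP_FAILED)
    {
        *err = errno;
        return NULL;
    }
    *err = 0;
    return p;
#else
    (void) len;
    *err = ENOSYS;
    return NULL;
#endif
}
```

Passing 100 as len corresponds to the "./mmaptest 100" run shown above;
whether the call succeeds depends on the kernel and on huge pages being
configured.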

So, let's just get rid of the /sys scanning code.

Robert, do you remember why you put the "pagesize =
sysconf(_SC_PAGE_SIZE);" call in the new mmap() shared memory allocator?

- Heikki

Attachment Content-Type Size
mmaptest.c text/x-csrc 430 bytes

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Richard Poole <richard(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-16 13:18:50
Message-ID: 20130916131850.GB5249@awork2.anarazel.de

On 2013-09-16 16:13:57 +0300, Heikki Linnakangas wrote:
> On 16.09.2013 13:15, Andres Freund wrote:
> >On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:
> >>On 14.09.2013 02:41, Richard Poole wrote:
> >>>The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
> >>>on systems that support it. It's based on Christian Kruse's patch from
> >>>last year, incorporating suggestions from Andres Freund.
> >>
> >>I don't understand the logic in figuring out the pagesize, and the smallest
> >>supported hugepage size. First of all, even without the patch, why do we
> >>round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
> >>will round up the request all by itself. The mmap() man page doesn't say
> >>anything about length having to be a multiple of the page size.
> >
> >I think it does:
> > EINVAL We don't like addr, length, or offset (e.g., they are too
> > large, or not aligned on a page boundary).
>
> That doesn't mean that they *all* have to be aligned on a page boundary.
> It's understandable that 'addr' and 'offset' have to be, but it doesn't make
> much sense for 'length'.
>
> >and
> > A file is mapped in multiples of the page size. For a file that is not a multiple
> > of the page size, the remaining memory is zeroed when mapped, and writes to that
> > region are not written out to the file. The effect of changing the size of the
> > underlying file of a mapping on the pages that correspond to added or removed
> > regions of the file is unspecified.
> >
> >And no, according to my past experience, the kernel does *not* do any
> >such rounding up. It will just fail.
>
> I wrote a little test program to play with different values (attached). I
> tried this on my laptop with a 3.2 kernel (uname -r: 3.10-2-amd6), and on a
> VM with a fresh CentOS 6.4 install with a 2.6.32 kernel
> (2.6.32-358.18.1.el6.x86_64), and they both work the same:
>
> $ ./mmaptest 100 # mmap 100 bytes
>
> in a different terminal:
> $ cat /proc/meminfo | grep HugePages_Rsvd
> HugePages_Rsvd: 1
>
> So even a tiny allocation, much smaller than any page size, succeeds, and it
> reserves a huge page. I tried the same with larger values; the kernel always
> uses huge pages, and rounds up the allocation to a multiple of the huge page
> size.

When developing the prototype I am pretty sure I had to add the rounding
up - but I am not sure why now, because after chatting with Heikki about
it, I've looked around and the initial MAP_HUGETLB support in the kernel
(commit 4e52780d41a741fb4861ae1df2413dd816ec11b1) has support for
rounding up.

> So, let's just get rid of the /sys scanning code.

Alternatively we could round up NBuffers to actually use the
additionally allocated space. Not sure if that's worth the amount of
code, but wasting several megabytes - or even gigabytes - of memory
isn't nice either.
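
To put a number on that idea, a rough sketch follows. It assumes the
default 8kB block size and a single huge page size;
extra_buffers_in_slack() is a hypothetical helper, not something that
exists in the tree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * How many extra `blcksz`-sized buffers fit into the slack created by
 * rounding `request` bytes up to a multiple of `hugepage`.
 */
size_t
extra_buffers_in_slack(size_t request, size_t hugepage, size_t blcksz)
{
    size_t      rounded;

    assert(hugepage > 0 && blcksz > 0);
    rounded = ((request + hugepage - 1) / hugepage) * hugepage;
    return (rounded - request) / blcksz;
}
```

With 2MB huge pages the slack is always under 2MB, so at most 255 extra
8kB buffers; with much larger huge pages the slack grows accordingly.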

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Richard Poole <richard(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-16 13:23:10
Message-ID: 20130916132310.GC5249@awork2.anarazel.de

On 2013-09-16 15:18:50 +0200, Andres Freund wrote:
> > So even a tiny allocation, much smaller than any page size, succeeds, and it
> > reserves a huge page. I tried the same with larger values; the kernel always
> > uses huge pages, and rounds up the allocation to a multiple of the huge page
> > size.
>
> When developing the prototype I am pretty sure I had to add the rounding
> up - but I am not sure why now, because after chatting with Heikki about
> it, I've looked around and the initial MAP_HUGETLB support in the kernel
> (commit 4e52780d41a741fb4861ae1df2413dd816ec11b1) has support for
> rounding up.

Ok, the reason for that seems to have been the following bug
https://bugzilla.kernel.org/show_bug.cgi?id=56881

Greetings,

Andres Freund


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Richard Poole <richard(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: add MAP_HUGETLB to mmap() where supported (WIP)
Date: 2013-09-17 20:09:38
Message-ID: CA+Tgmob5HjYjMy6PwqOy7GDZKT=AmTUhuggtJj4a_RqQvJe6Jg@mail.gmail.com

On Mon, Sep 16, 2013 at 9:13 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> Robert, do you remember why you put the "pagesize = sysconf(_SC_PAGE_SIZE);"
> call in the new mmap() shared memory allocator?

Hmm, no. Unfortunately, I don't. We could try ripping it out and see
if the buildfarm breaks. If it is needed, then the dynamic shared
memory patch I posted probably needs it as well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: andres(at)2ndquadrant(dot)com, hlinnakangas(at)vmware(dot)com
Subject: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-24 06:03:13
Message-ID: 20131024060313.GA21888@toroid.org

Hi.

This is a slightly reworked version of the patch submitted by Richard
Poole last month, which was based on Christian Kruse's earlier patch.

Apart from doing various minor cleanups and documentation fixes, I also
tested this patch against HEAD on a machine with 256GB of RAM. Here's an
overview of the results.

I set nr_hugepages to 32768 (== 64GB), which (took a very long time and)
allowed me to set shared_buffers to 60GB. I then ran pgbench -s 1000 -i,
and did some runs of "pgbench -c 100 -j 10 -t 1000" with huge_tlb_pages
set to off and on respectively.
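
As a sanity check on those numbers, here is a small sketch of the
arithmetic, assuming 2MB huge pages; hugepages_needed() is a made-up
helper for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of huge pages needed to cover `bytes`, rounding up. */
uint64_t
hugepages_needed(uint64_t bytes, uint64_t hugepage_size)
{
    assert(hugepage_size > 0);
    return (bytes + hugepage_size - 1) / hugepage_size;
}
```

32768 pages of 2MB come to 64GB, and 60GB of shared_buffers alone accounts
for 30720 of them; the rest of the shared memory segment needs some
headroom on top of that.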

With huge_tlb_pages=off, this is the best result I got:

tps = 8680.771068 (including connections establishing)
tps = 8721.504838 (excluding connections establishing)

With huge_tlb_pages=on, this is the best result I got:

tps = 9932.245203 (including connections establishing)
tps = 9983.190304 (excluding connections establishing)

(Even the worst result I got in the latter case was a smidgen faster
than the best with huge_tlb_pages=off: 8796.344078 vs. 8721.504838.)

From /proc/$pid/status, VmPTE was 2880kB with huge_tlb_pages=off, and
56kB with it turned on.

One open question is what to do about rounding up the size. It should
not be necessary, but for the fairly recent bug described at the link
in the comment (https://bugzilla.kernel.org/show_bug.cgi?id=56881). I
tried it without the rounding-up, and it fails on Ubuntu's 3.5.0-28
kernel (mmap returns EINVAL).

Any thoughts?

-- Abhijit

Attachment Content-Type Size
hugepages-v3.patch text/x-diff 13.4 KB

From: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
To:
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-24 06:06:51
Message-ID: 20131024060651.GB16636@toroid.org

At 2013-10-24 11:33:13 +0530, ams(at)2ndquadrant(dot)com wrote:
>
> From /proc/$pid/status, VmPTE was 2880kB with huge_tlb_pages=off, and
> 56kB with it turned on.

(VmPTE is the size of the process's page tables.)
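
For the curious, here is a sketch of pulling that number out of
status-style text. It assumes the usual "Key:<whitespace><number> kB" line
layout; status_field_kb() is a name invented for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Find a "Key:   <number> kB" line in status-style text and return the
 * number, or -1 if the key is absent or its value is malformed.
 */
long
status_field_kb(const char *text, const char *key)
{
    size_t      keylen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0')
    {
        long        value;

        /* a match needs the key at line start, immediately followed by ':' */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':')
        {
            if (sscanf(line + keylen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        /* advance to the start of the next line, if any */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}
```

Feeding it the contents of /proc/<pid>/status with the key "VmPTE" gives
the page-table size in kB.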

-- Abhijit


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, andres(at)2ndquadrant(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-24 13:06:19
Message-ID: 52691B4B.10309@vmware.com

On 24.10.2013 09:03, Abhijit Menon-Sen wrote:
> This is a slightly reworked version of the patch submitted by Richard
> Poole last month, which was based on Christian Kruse's earlier patch.

Thanks.

> With huge_tlb_pages=off, this is the best result I got:
>
> tps = 8680.771068 (including connections establishing)
> tps = 8721.504838 (excluding connections establishing)
>
> With huge_tlb_pages=on, this is the best result I got:
>
> tps = 9932.245203 (including connections establishing)
> tps = 9983.190304 (excluding connections establishing)
>
> (Even the worst result I got in the latter case was a smidgen faster
> than the best with huge_tlb_pages=off: 8796.344078 vs. 8721.504838.)

That's really impressive.

> One open question is what to do about rounding up the size. It should
> not be necessary, but for the fairly recent bug described at the link
> in the comment (https://bugzilla.kernel.org/show_bug.cgi?id=56881). I
> tried it without the rounding-up, and it fails on Ubuntu's 3.5.0-28
> kernel (mmap returns EINVAL).

Let's get rid of the rounding. It's clearly a kernel bug, and it
shouldn't be our business to add workarounds for any kernel bug out
there. And the worst that will happen if you're running a buggy kernel
version is that you fall back to not using huge pages (assuming
huge_tlb_pages=try).

Other comments:

* guc.c doesn't actually need sys/mman.h for anything. Getting rid of
the #include also lets you remove the configure test.

* the documentation should perhaps mention that the setting only has an
effect if POSIX shared memory is used. That's the default on Linux, but
we will try to fall back to SystemV shared memory if it fails.

- Heikki


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-24 15:04:02
Message-ID: CA+TgmoYYodLk_DbH5fiD6Zs=E6To243c2iJ_jjs0042dsakdiA@mail.gmail.com

On Thu, Oct 24, 2013 at 9:06 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> * the documentation should perhaps mention that the setting only has an
> effect if POSIX shared memory is used. That's the default on Linux, but we
> will try to fall back to SystemV shared memory if it fails.

This is true for dynamic shared memory, but not for the main shared
memory segment. The main shared memory segment is always the
combination of a small, fixed-size System V shared memory chunk and an
anonymous shared memory region created by mmap(NULL, ..., MAP_SHARED).
POSIX shared memory is not used.

(Exceptions: Anonymous shared memory isn't used on Windows, which has
its own mechanism, or when compiling with EXEC_BACKEND, when the whole
chunk is allocated as System V shared memory.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-24 17:00:28
Message-ID: 20131024170028.GC18793@awork2.anarazel.de

On 2013-10-24 16:06:19 +0300, Heikki Linnakangas wrote:
> On 24.10.2013 09:03, Abhijit Menon-Sen wrote:
> >One open question is what to do about rounding up the size. It should
> >not be necessary, but for the fairly recent bug described at the link
> >in the comment (https://bugzilla.kernel.org/show_bug.cgi?id=56881). I
> >tried it without the rounding-up, and it fails on Ubuntu's 3.5.0-28
> >kernel (mmap returns EINVAL).
>
> Let's get rid of the rounding. It's clearly a kernel bug, and it shouldn't
> be our business to add workarounds for any kernel bug out there. And the
> worst that will happen if you're running a buggy kernel version is that you
> fall back to not using huge pages (assuming huge_tlb_pages=try).

But it's a range of relatively popular kernels, that will stay around
for a good while. So I am hesitant to just not do anything about it. The
directory scanning code isn't that bad imo.

Either way:
I think we should log when we tried to use hugepages but fell back to
plain mmap; currently it's hard to see whether they are used.
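
A minimal sketch of that try-then-fall-back-with-a-message behaviour is
below. HugeMode and create_anon_segment() are hypothetical names, the
off/try/on values mirror the huge_tlb_pages settings discussed in this
thread, and real server code would report through ereport() rather than
stderr.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS / MAP_HUGETLB on glibc */
#endif
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

typedef enum { HUGE_OFF, HUGE_TRY, HUGE_ON } HugeMode;  /* hypothetical */

/*
 * Create an anonymous shared mapping. With HUGE_TRY, attempt huge pages
 * first and fall back to a plain mapping, saying so on stderr. With
 * HUGE_ON, a huge page failure is final. Returns NULL on failure.
 */
void *
create_anon_segment(size_t size, HugeMode mode, bool *used_huge)
{
    void       *p = MAP_FAILED;

    *used_huge = false;
#ifdef MAP_HUGETLB
    if (mode != HUGE_OFF)
    {
        p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p != MAP_FAILED)
            *used_huge = true;
        else if (mode == HUGE_TRY)
            fprintf(stderr, "huge pages not used, falling back: %s\n",
                    strerror(errno));
    }
#endif
    /* plain mapping: always for "off", and as the fallback for "try" */
    if (p == MAP_FAILED && mode != HUGE_ON)
        p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}
```

The used_huge flag is what a later status check could expose, so that the
answer does not depend on still having the startup log.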

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-24 17:13:55
Message-ID: CA+TgmoZypzzdyVj1cpPJ9O-Nh-A9_Uqdz5w4Ete_QzMEoX01-Q@mail.gmail.com

On Thu, Oct 24, 2013 at 1:00 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-10-24 16:06:19 +0300, Heikki Linnakangas wrote:
>> On 24.10.2013 09:03, Abhijit Menon-Sen wrote:
>> >One open question is what to do about rounding up the size. It should
>> >not be necessary, but for the fairly recent bug described at the link
>> >in the comment (https://bugzilla.kernel.org/show_bug.cgi?id=56881). I
>> >tried it without the rounding-up, and it fails on Ubuntu's 3.5.0-28
>> >kernel (mmap returns EINVAL).
>>
>> Let's get rid of the rounding. It's clearly a kernel bug, and it shouldn't
>> be our business to add workarounds for any kernel bug out there. And the
>> worst that will happen if you're running a buggy kernel version is that you
>> fall back to not using huge pages (assuming huge_tlb_pages=try).
>
> But it's a range of relatively popular kernels, that will stay around
> for a good while. So I am hesitant to just not do anything about it. The
> directory scanning code isn't that bad imo.
>
> Either way:
> I think we should log when we tried to use hugepages but fell back to
> plain mmap, currently it's hard to see whether they are used.

Logging it might be a good idea, but suppose the system's been running
for 6 months and you don't have the startup logs. It might be good
to have an easy way to discover later what happened back then.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 03:52:34
Message-ID: CAL_0b1uwkAYuKf=Ga7rBzDdwL4-7TmV6fYBexCT4Cz17btoi_Q@mail.gmail.com

Hi,

On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com> wrote:
> This is a slightly reworked version of the patch submitted by Richard
> Poole last month, which was based on Christian Kruse's earlier patch.

Is it possible that this patch will be included in a minor version of
9.3? IMHO hugepages is a very important ability that postgres lost in
9.3, and it would be great to have it back ASAP.

Thank you.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 04:31:54
Message-ID: 17939.1383107514@sss.pgh.pa.us

Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> writes:
> On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com> wrote:
>> This is a slightly reworked version of the patch submitted by Richard
>> Poole last month, which was based on Christian Kruse's earlier patch.

> Is it possible that this patch will be included in a minor version of
> 9.3? IMHO hugepages is a very important ability that postgres lost in
> 9.3, and it would be great to have it back ASAP.

Say what? There's never been any hugepages support in Postgres.

regards, tom lane


From: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, andres(at)2ndquadrant(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 04:46:57
Message-ID: 20131030044657.GD4183@toroid.org

At 2013-10-24 16:06:19 +0300, hlinnakangas(at)vmware(dot)com wrote:
>
> Let's get rid of the rounding.

I share Andres's concern that the bug is present in various recent
kernels that are going to stick around for quite some time. Given
the rather significant performance gain, I think it's worth doing
something, though I'm not a big fan of the directory-scanning code
myself.

As a compromise, perhaps we can unconditionally round the size up to be
a multiple of 2MB? That way, we can use huge pages more often, but also
avoid putting a lot of code and effort into the workaround and waste
only a little space (if any at all).
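
In code, the compromise might look like the sketch below. The 2MB unit is
hard-coded as proposed above, and HUGE_PAGE_ROUND and round_up_2mb() are
names invented for this example rather than anything in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define HUGE_PAGE_ROUND ((size_t) 2 * 1024 * 1024)  /* assumed 2MB unit */

/* Round `size` up to the next multiple of 2MB. */
size_t
round_up_2mb(size_t size)
{
    /* HUGE_PAGE_ROUND is a power of two, so masking off the low bits works */
    size_t      rounded = (size + HUGE_PAGE_ROUND - 1) & ~(HUGE_PAGE_ROUND - 1);

    assert(rounded >= size);    /* guard against wraparound near SIZE_MAX */
    return rounded;
}
```

The waste is always under 2MB, though a fixed unit would not cover a
system whose huge page size is larger than that.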

> Other comments:
>
> * guc.c doesn't actually need sys/mman.h for anything. Getting rid
> of the #include also lets you remove the configure test.

You're right, guc.c doesn't use it any more; I've removed the #include.

sysv_shmem.c does use it (MAP_*, PROT_*), however, so I've left the test
in configure alone. I see that sys/mman.h is included elsewhere with an
#ifdef WIN32 or HAVE_SHM_OPEN guard, but HAVE_SYS_MMAN_H seems better.

> * the documentation should perhaps mention that the setting only has
> an effect if POSIX shared memory is used.

As Robert said, this is not correct, so I haven't changed anything.

-- Abhijit


From: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 04:58:26
Message-ID: 20131030045826.GE4183@toroid.org

At 2013-10-24 19:00:28 +0200, andres(at)2ndquadrant(dot)com wrote:
>
> I think we should log when we tried to use hugepages but fell back to
> plain mmap, currently it's hard to see whether they are used.

Good idea, thanks. I'll do this in the next patch I post (which will be
after we reach some consensus about how to handle the rounding problem).

-- Abhijit


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 06:08:05
Message-ID: CAL_0b1taBrORqKE7wz=U=qUwjTk4ZyVjPfWUhwVgMyU2RD3moQ@mail.gmail.com

On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> writes:
>> On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com> wrote:
>>> This is a slightly reworked version of the patch submitted by Richard
>>> Poole last month, which was based on Christian Kruse's earlier patch.
>
>> Is it possible that this patch will be included in a minor version of
>> 9.3? IMHO hugepages is a very important ability that postgres lost in
>> 9.3, and it would be great to have it back ASAP.
>
> Say what? There's never been any hugepages support in Postgres.

There was an ability to back shared memory with hugepages when using
<=9.2. I have used it on ~30 servers for several years, and it brings an
8-17% performance gain depending on the memory size. You can find
several paragraphs describing how to do it here:
https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
Just search for the word 'hugepages' on the page.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com


From: David Fetter <david(at)fetter(dot)org>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 07:08:23
Message-ID: 20131030070823.GA13926@fetter.org

On Tue, Oct 29, 2013 at 11:08:05PM -0700, Sergey Konoplev wrote:
> On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> writes:
> >> On Wed, Oct 23, 2013 at 11:03 PM, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com> wrote:
> >>> This is a slightly reworked version of the patch submitted by Richard
> >>> Poole last month, which was based on Christian Kruse's earlier patch.
> >
> >> Is it possible that this patch will be included in a minor version of
> >> 9.3? IMHO hugepages is a very important ability that postgres lost in
> >> 9.3, and it would be great to have it back ASAP.
> >
> > Say what? There's never been any hugepages support in Postgres.
>
> There was an ability to back shared memory with hugepages when using
> <=9.2. I have used it on ~30 servers for several years, and it brings an
> 8-17% performance gain depending on the memory size. You can find
> several paragraphs describing how to do it here:
> https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
> Just search for the word 'hugepages' on the page.

For better or worse, we add new features exactly and only in .0
releases. It's what's made it possible for people to plan
deployments, given us a deserved reputation for stability, etc., etc.

I guess what I'm saying here is that awesome as any particular feature
might be to back-patch, that benefit is overwhelmed by the cost of
having unstable releases.

-infinity from me to any proposal that gets us into "are you using
PostgreSQL x.y.z or x.y.w?" when it comes to features.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: David Fetter <david(at)fetter(dot)org>
To: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, andres(at)2ndquadrant(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 07:10:39
Message-ID: 20131030071039.GB13926@fetter.org
Lists: pgsql-hackers

On Wed, Oct 30, 2013 at 10:16:57AM +0530, Abhijit Menon-Sen wrote:
> At 2013-10-24 16:06:19 +0300, hlinnakangas(at)vmware(dot)com wrote:
> >
> > Let's get rid of the rounding.
>
> I share Andres's concern that the bug is present in various recent
> kernels that are going to stick around for quite some time. Given
> the rather significant performance gain, I think it's worth doing
> something, though I'm not a big fan of the directory-scanning code
> myself.
>
> As a compromise, perhaps we can unconditionally round the size up to be
> a multiple of 2MB?

How about documenting that 2MB is the quantum (OK, we'll say
"indivisible unit" or "smallest division" or something) and failing
with a message to that effect if someone tries to set it otherwise?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, andres(at)2ndquadrant(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 07:28:33
Message-ID: 20131030072833.GG4183@toroid.org
Lists: pgsql-hackers

At 2013-10-30 00:10:39 -0700, david(at)fetter(dot)org wrote:
>
> How about documenting that 2MB is the quantum (OK, we'll say
> "indivisible unit" or "smallest division" or something) and failing
> with a message to that effect if someone tries to set it otherwise?

I don't think you understand the problem. We're not discussing a user
setting here. The size that is passed to PGSharedMemoryCreate is based
on shared_buffers and our estimates of how much memory we need for other
things like ProcArray (see ipci.c:CreateSharedMemoryAndSemaphores).

If this calculated size is not a multiple of a page size supported by
the hardware (usually 2/4/16MB etc.), the allocation will fail under
some commonly-used kernels. We can either ignore the problem and let
the allocation fail, or try to discover the smallest supported huge
page size (what the patch does now), or assume that 2MB pages can be
used if any huge pages can be used and align accordingly.

We could use a larger size, e.g. if we aligned to 16MB then it would
work on hardware that supported 2/4/8/16MB pages, but we'd waste the
extra memory unless we also increased NBuffers after the rounding up
(which is also something Andres suggested earlier).
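
To illustrate the "align accordingly" option, here is a minimal C sketch of
rounding a calculated size up to a multiple of an assumed huge page size. The
helper name and the fixed 2MB constant are assumptions made for illustration
and are not taken from the patch; the bit trick only works for power-of-two
alignments.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed alignment: 2MB, the smallest huge page size on current x86. */
#define HUGE_PAGE_ALIGN ((size_t) 2 * 1024 * 1024)

/*
 * Hypothetical helper: round "size" up to the next multiple of "align".
 * "align" must be a non-zero power of two. Overflow near SIZE_MAX is
 * not handled in this sketch.
 */
static size_t
round_up_to_multiple(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

With a 2MB alignment, a request of 189390848 bytes would become 190840832
bytes (91 pages of 2MB), so less than one page is added.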

I don't have a strong opinion on the available options, other than not
liking the "do nothing" approach.

-- Abhijit


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, andres(at)2ndquadrant(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 15:04:36
Message-ID: 3980.1383145476@sss.pgh.pa.us
Lists: pgsql-hackers

Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com> writes:
> As a compromise, perhaps we can unconditionally round the size up to be
> a multiple of 2MB? That way, we can use huge pages more often, but also
> avoid putting in a lot of code and effort into the workaround and waste
> only a little space (if any at all).

That sounds reasonably painless to me. Note that at least in our main
shmem segment, "extra" space is not useless, because it allows slop for
the main hash tables, notably the locks table.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 15:11:04
Message-ID: 4022.1383145864@sss.pgh.pa.us
Lists: pgsql-hackers

Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> writes:
> On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Say what? There's never been any hugepages support in Postgres.

> There were an ability to back shared memory with hugepages when using
> <=9.2. I use it on ~30 servers for several years and it brings 8-17%
> of performance depending on the memory size. Here you will find
> several paragraphs of the description about how to do it
> https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.

What this describes is how to modify Postgres to request huge pages.
That's hardly built-in support.

In any case, as David already explained, we don't do feature additions
in minor releases. We'd be especially unlikely to make an exception
for this, since it has uncertain portability and benefits. Anything
that carries portability risks has got to go through a beta testing
cycle before we'll unleash it on the masses.

regards, tom lane


From: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, andres(at)2ndquadrant(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 17:09:20
Message-ID: 20131030170920.GI4183@toroid.org
Lists: pgsql-hackers

At 2013-10-30 11:04:36 -0400, tgl(at)sss(dot)pgh(dot)pa(dot)us wrote:
>
> > As a compromise, perhaps we can unconditionally round the size up to be
> > a multiple of 2MB? […]
>
> That sounds reasonably painless to me.

Here's a patch that does that and adds a DEBUG1 log message when we try
with MAP_HUGETLB and fail and fall back to ordinary mmap.

-- Abhijit

Attachment Content-Type Size
hugepages-v4.patch text/x-diff 9.4 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 17:11:34
Message-ID: 20131030171134.GA22214@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-10-30 22:39:20 +0530, Abhijit Menon-Sen wrote:
> At 2013-10-30 11:04:36 -0400, tgl(at)sss(dot)pgh(dot)pa(dot)us wrote:
> >
> > > As a compromise, perhaps we can unconditionally round the size up to be
> > > a multiple of 2MB? […]
> >
> > That sounds reasonably painless to me.
>
> Here's a patch that does that and adds a DEBUG1 log message when we try
> with MAP_HUGETLB and fail and fall back to ordinary mmap.

But it's in no way guaranteed that the smallest hugepage size is
2MB. It'll be on current x86 hardware, but not on any other platform...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 17:15:04
Message-ID: CAL_0b1uJFkwcbrnU=yP5PAC5n8HYXjfkZMVrXWwFL938DPVWZg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 30, 2013 at 8:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> writes:
>> On Tue, Oct 29, 2013 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Say what? There's never been any hugepages support in Postgres.
>
>> There were an ability to back shared memory with hugepages when using
>> <=9.2. I use it on ~30 servers for several years and it brings 8-17%
>> of performance depending on the memory size. Here you will find
>> several paragraphs of the description about how to do it
>> https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
>
> What this describes is how to modify Postgres to request huge pages.
> That's hardly built-in support.

I wasn't talking about a built-in support. It was about an ability (a
way) to back sh_buf with hugepages.

> In any case, as David already explained, we don't do feature additions
> in minor releases. We'd be especially unlikely to make an exception
> for this, since it has uncertain portability and benefits. Anything
> that carries portability risks has got to go through a beta testing
> cycle before we'll unleash it on the masses.

Yes, I got the idea. Thanks both of you for clarification.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 18:50:18
Message-ID: 20131030185018.GG5922@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Sergey Konoplev escribió:
> On Wed, Oct 30, 2013 at 8:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> writes:

> >> There were an ability to back shared memory with hugepages when using
> >> <=9.2. I use it on ~30 servers for several years and it brings 8-17%
> >> of performance depending on the memory size. Here you will find
> >> several paragraphs of the description about how to do it
> >> https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
> >
> > What this describes is how to modify Postgres to request huge pages.
> > That's hardly built-in support.
>
> I wasn't talking about a built-in support. It was about an ability (a
> way) to back sh_buf with hugepages.

Then what you need is to set
dynamic_shared_memory_type = sysv
in postgresql.conf.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 19:13:47
Message-ID: CAL_0b1tvY6iB2z0V9EhUvDPqJYHZscAwBcf9qXRTHUxJLA2Svw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 30, 2013 at 11:50 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
>> >> There were an ability to back shared memory with hugepages when using
>> >> <=9.2. I use it on ~30 servers for several years and it brings 8-17%
>> >> of performance depending on the memory size. Here you will find
>> >> several paragraphs of the description about how to do it
>> >> https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md.
>> >
>> > What this describes is how to modify Postgres to request huge pages.
>> > That's hardly built-in support.
>>
>> I wasn't talking about a built-in support. It was about an ability (a
>> way) to back sh_buf with hugepages.
>
> Then what you need is to set
> dynamic_shared_memory_type = sysv
> in postgresql.conf.

I could not find this parameter in the docs, and it does not work when I
specify it in postgresql.conf.

LOG: unrecognized configuration parameter
"dynamic_shared_memory_type" in file
"/etc/postgresql/9.3/main/postgresql.conf" line 114
FATAL: configuration file "/etc/postgresql/9.3/main/postgresql.conf"
contains errors

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 19:17:24
Message-ID: 20131030191724.GH5922@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Alvaro Herrera escribió:
> Sergey Konoplev escribió:

> > I wasn't talking about a built-in support. It was about an ability (a
> > way) to back sh_buf with hugepages.
>
> Then what you need is to set
> dynamic_shared_memory_type = sysv
> in postgresql.conf.

The above is mistaken -- there's no way to disable the mmap() segment in
9.3, other than recompiling with EXEC_BACKEND which is probably
undesirable for other reasons.

I don't think I had ever heard of that recipe to use huge pages in
previous versions; since the win is probably significant in some
systems, we could have made this configurable.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 19:51:17
Message-ID: CAL_0b1sbKHLQfVX5vaYaP8qqte9piO53DOHx9GL96jS35mcU4w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 30, 2013 at 12:17 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
>> > I wasn't talking about a built-in support. It was about an ability (a
>> > way) to back sh_buf with hugepages.
>>
>> Then what you need is to set
>> dynamic_shared_memory_type = sysv
>> in postgresql.conf.
>
> The above is mistaken -- there's no way to disable the mmap() segment in
> 9.3, other than recompiling with EXEC_BACKEND which is probably
> undesirable for other reasons.

Alternatively, I assume it could be linked with libhugetlbfs and you
don't need any source modifications in this case. However I am not
sure it will work with shared memory.

> I don't think I had ever heard of that recipe to use huge pages in
> previous versions; since the win is probably significant in some
> systems, we could have made this configurable.

There are several articles on the web describing how to do this,
besides mine. And the win becomes most significant when you
have 64GB or more on your server.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com


From: Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, hlinnakangas(at)vmware(dot)com
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-10-30 21:28:43
Message-ID: CAL_0b1ucFq4DpGYHXczW2MU+NSpr0d4M3ry=G_uNUb-x5VqSQA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 30, 2013 at 12:51 PM, Sergey Konoplev <gray(dot)ru(at)gmail(dot)com> wrote:
> On Wed, Oct 30, 2013 at 12:17 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>>> > I wasn't talking about a built-in support. It was about an ability (a
>>> > way) to back sh_buf with hugepages.
>>>
>>> Then what you need is to set
>>> dynamic_shared_memory_type = sysv
>>> in postgresql.conf.
>>
>> The above is mistaken -- there's no way to disable the mmap() segment in
>> 9.3, other than recompiling with EXEC_BACKEND which is probably
>> undesirable for other reasons.
>
> Alternatively, I assume it could be linked with libhugetlbfs and you
> don't need any source modifications in this case. However I am not
> sure it will work with shared memory.

BTW, I managed to run 9.3 backed with hugepages after I put
HUGETLB_MORECORE (see man libhugetlbfs) into the environment yesterday,
but after running for some time it failed with the messages shown below.

syslog:

Oct 29 17:53:13 grayhemp kernel: [150579.903875] PID 7584 killed due
to inadequate hugepage pool

postgres:

libhugetlbfslibhugetlbfs2013-10-29 17:53:21 PDT LOG: server process
(PID 7584) was terminated by signal 7: Bus error
2013-10-29 17:53:21 PDT LOG: terminating any other active server processes
2013-10-29 17:53:21 PDT WARNING: terminating connection because of crash of
another server process
2013-10-29 17:53:21 PDT DETAIL: The postmaster has commanded this
server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.

My theory is that it happened once the number of huge pages
(vm.nr_overcommit_hugepages + vm.nr_hugepages) was exceeded, but I
might be wrong.

Does anybody have any thoughts on why this happened and how to work around it?

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray(dot)ru(at)gmail(dot)com


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-15 13:17:32
Message-ID: 52861EEC.2090702@vmware.com
Lists: pgsql-hackers

On 30.10.2013 19:11, Andres Freund wrote:
> On 2013-10-30 22:39:20 +0530, Abhijit Menon-Sen wrote:
>> At 2013-10-30 11:04:36 -0400, tgl(at)sss(dot)pgh(dot)pa(dot)us wrote:
>>>
>>>> As a compromise, perhaps we can unconditionally round the size up to be
>>>> a multiple of 2MB? […]
>>>
>>> That sounds reasonably painless to me.
>>
>> Here's a patch that does that and adds a DEBUG1 log message when we try
>> with MAP_HUGETLB and fail and fall back to ordinary mmap.
>
> But it's in no way guaranteed that the smallest hugepage size is
> 2MB. It'll be on current x86 hardware, but not on any other platform...

Sure, but there's no big harm done. We're just trying to avoid hitting a
kernel bug, and as a bonus, we avoid wasting some memory that would
otherwise be lost due to the kernel rounding the allocation. If the
smallest hugepage size is smaller than 2MB, we round up the allocation
unnecessarily, but that doesn't seem serious.

I spent some time whacking this around, new patch version attached. I
moved the mmap() code into a new function, that leaves the
PGSharedMemoryCreate more readable.

I modified the patch so that it throws an error if you set
huge_tlb_pages=on, and the platform doesn't support MAP_HUGETLB (ie.
non-Linux, or EXEC_BACKEND). 'try' is the default, so this only affects
you if you explicitly set it to 'on'. I think that's the right behavior;
if you explicitly ask for it, and you don't get it, that should be an
error. But I'm not wedded to the idea if someone objects; a log message
might also be reasonable: "LOG: huge TLB pages are not supported on this
platform, but huge_tlb_pages was 'on'"
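
The off/try/on behaviour described above can be summarised as a small decision
function. This is only an illustrative sketch: the enum and function names are
hypothetical and do not come from the patch, and "platform_supported" stands
for "MAP_HUGETLB is available and we are not building with EXEC_BACKEND".

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { HUGE_PAGES_OFF, HUGE_PAGES_TRY, HUGE_PAGES_ON } HugePagesSetting;
typedef enum { USE_NORMAL_PAGES, USE_HUGE_PAGES, FAIL_STARTUP } HugePagesOutcome;

/*
 * Hypothetical summary of the policy: 'off' never uses huge pages, 'try'
 * silently falls back to normal pages, and 'on' turns any failure to get
 * huge pages into a startup error.
 */
static HugePagesOutcome
decide_huge_pages(HugePagesSetting setting, bool platform_supported,
                  bool huge_alloc_ok)
{
    assert(setting == HUGE_PAGES_OFF || setting == HUGE_PAGES_TRY ||
           setting == HUGE_PAGES_ON);

    if (setting == HUGE_PAGES_OFF)
        return USE_NORMAL_PAGES;
    if (platform_supported && huge_alloc_ok)
        return USE_HUGE_PAGES;
    /* Huge pages were wanted but could not be obtained. */
    return (setting == HUGE_PAGES_ON) ? FAIL_STARTUP : USE_NORMAL_PAGES;
}
```

Only the explicit 'on' setting can produce an error in this sketch, which
matches the intent that the default 'try' never prevents startup.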

The error message on failed allocation, if huge_tlb_pages=on, needs
updating:

$ bin/postmaster -D data
FATAL: could not map anonymous shared memory: Cannot allocate memory
HINT: This error usually means that PostgreSQL's request for a shared
memory segment exceeded available memory or swap space. To reduce the
request size (currently 189390848 bytes), reduce PostgreSQL's shared
memory usage, perhaps by reducing shared_buffers or max_connections.

The reason the allocation failed in this case was that I used
huge_tlb_pages=on, but had not configured the kernel for huge pages. The
hint is quite misleading in that case; it should advise configuring the
kernel or turning off huge_tlb_pages.

The documentation needs some work. I think it's pretty user-unfriendly
to link to https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt.
It gives a lot of details, and although it explains stuff that is
relevant, like setting the nr_hugepages sysctl, it also contains a lot
of stuff that is not relevant to us, like how to mount hugetlbfs. Can we
do better than that? Is there a better guide somewhere on how to set the
kernel settings? If not, we should include step-by-step instructions in
our manual.
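
One step such instructions would probably need is working out how many huge
pages to reserve through the nr_hugepages sysctl. The C sketch below shows the
arithmetic only; the function name is hypothetical, the 2MB page size is an
assumption (it varies by platform), and the result ignores any other users of
the huge page pool.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of huge pages needed to cover a shared
 * memory request of "request_bytes", rounding up to whole pages.
 */
static size_t
huge_pages_needed(size_t request_bytes, size_t huge_page_bytes)
{
    assert(huge_page_bytes > 0);
    return (request_bytes + huge_page_bytes - 1) / huge_page_bytes;
}
```

For the 189390848-byte request in the error message above and 2MB pages,
huge_pages_needed(189390848, 2097152) gives 91.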

The "Managing Kernel Resources" section in the user manual should also
be updated to mention how to enable huge pages.

Also, now that I changed huge_tlb_pages='on' to fail on platforms where
it's not supported at all, the docs need to be updated to reflect it.

- Heikki

Attachment Content-Type Size
hugepages-v5.patch text/x-diff 11.7 KB

From: Sameer Kumar <sameer(dot)kumar(at)ashnik(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-17 04:22:11
Message-ID: CADp-Sm6v1YJkHEg-Xfob5Mkpyc0mPR=XHwcP459+QUQOR-DhOA@mail.gmail.com
Lists: pgsql-hackers

I was recently running some tests with huge page tables. I ran them on two
different architectures: x86 and PPC64.

I saw the discussion going on here, so I thought I would share.
I was using 3 Cores, 8GB RAM, 2 LUN for filesystem (1 for dbfiles and 1 for
logfiles) for these tests...

I had dedicated
(shared_buffers + 400bytes*max_connection + wal_buffers)/Pagesize [from
/proc/meminfo] for huge pages. I kept some overcommit_hugepages to be used
by work_mem (max_connection*work_mem)/Pagesize
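
As a rough illustration of this sizing rule, the C sketch below reads the
Hugepagesize value from text in the format of /proc/meminfo and applies the
formula quoted above. Both function names are hypothetical, the rounding up to
a whole page is an added assumption, and the input values in the usage note
are made-up examples rather than measurements.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find the "Hugepagesize:" line in text formatted like
 * /proc/meminfo and return its value in kB, or -1 if there is no such line.
 */
static long
parse_hugepagesize_kb(const char *meminfo)
{
    const char *line = meminfo;

    assert(meminfo != NULL);
    while (line != NULL && *line != '\0')
    {
        long kb;

        if (sscanf(line, "Hugepagesize: %ld kB", &kb) == 1)
            return kb;
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/*
 * (shared_buffers + 400 bytes * max_connections + wal_buffers) / Pagesize,
 * rounded up to a whole number of huge pages.
 */
static long long
sizing_rule_pages(long long shared_buffers_bytes, long long max_connections,
                  long long wal_buffers_bytes, long hugepagesize_kb)
{
    long long page_bytes = (long long) hugepagesize_kb * 1024;
    long long total = shared_buffers_bytes + 400 * max_connections +
                      wal_buffers_bytes;

    assert(page_bytes > 0);
    return (total + page_bytes - 1) / page_bytes;
}
```

For example, with 2GB of shared_buffers, 100 connections, 16MB of wal_buffers
and a 2048 kB huge page size, sizing_rule_pages(2147483648LL, 100, 16777216LL,
2048) yields 1033 pages.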

x86_64 gave me a benefit of 2-5% for a TPC-C workload (I scaled from 1 to
100 users). PPC64, which uses 16MB and 64MB, did not give me any benefit; in
fact, performance degraded as the concurrency of the system increased.

my 2 cents, hope it helps.


From: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-18 07:29:52
Message-ID: 20131118072952.GA17956@toroid.org
Lists: pgsql-hackers

At 2013-11-15 15:17:32 +0200, hlinnakangas(at)vmware(dot)com wrote:
>
> I spent some time whacking this around, new patch version attached.

Thanks.

> But I'm not wedded to the idea if someone objects; a log message might
> also be reasonable: "LOG: huge TLB pages are not supported on this
> platform, but huge_tlb_pages was 'on'"

Put that way, I have to wonder if the right thing to do is just to have
a "try_huge_pages=on|off" setting, and log a warning if the attempt did
not succeed. It would be easier to document, and I don't think there's
much point in making it an error if the allocation fails.

-- Abhijit

P.S. I'd be happy to do the followup work for this patch (updating
documentation, etc.), but it'll have to wait until I recover from
this !#$&@! stomach bug.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-21 21:09:38
Message-ID: 20131121210938.GE6041@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Abhijit Menon-Sen wrote:
> At 2013-11-15 15:17:32 +0200, hlinnakangas(at)vmware(dot)com wrote:

> > But I'm not wedded to the idea if someone objects; a log message might
> > also be reasonable: "LOG: huge TLB pages are not supported on this
> > platform, but huge_tlb_pages was 'on'"
>
> Put that way, I have to wonder if the right thing to do is just to have
> a "try_huge_pages=on|off" setting, and log a warning if the attempt did
> not succeed. It would be easier to document, and I don't think there's
> much point in making it an error if the allocation fails.

What about
huge_tlb_pages={off,try}

Or maybe
huge_tlb_pages={off,try,require}

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-21 21:14:35
Message-ID: 20131121211435.GA14939@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-11-21 18:09:38 -0300, Alvaro Herrera wrote:
> Abhijit Menon-Sen wrote:
> > At 2013-11-15 15:17:32 +0200, hlinnakangas(at)vmware(dot)com wrote:
>
> > > But I'm not wedded to the idea if someone objects; a log message might
> > > also be reasonable: "LOG: huge TLB pages are not supported on this
> > > platform, but huge_tlb_pages was 'on'"
> >
> > Put that way, I have to wonder if the right thing to do is just to have
> > a "try_huge_pages=on|off" setting, and log a warning if the attempt did
> > not succeed. It would be easier to document, and I don't think there's
> > much point in making it an error if the allocation fails.
>
> What about
> huge_tlb_pages={off,try}
>
> Or maybe
> huge_tlb_pages={off,try,require}

I'd certainly want a setting that errors out if it cannot get the memory
using hugetables. If you rely on the reduction in memory (which can be
significant on large s_b, large max_connections), it's rather annoying
not to know whether it succeeded using it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-21 21:24:56
Message-ID: CA+TgmobvG7D0aetpo56aSUV00FF0aTWUAXh7j5tS9BJ=YHF+yA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 21, 2013 at 4:09 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Abhijit Menon-Sen wrote:
>> At 2013-11-15 15:17:32 +0200, hlinnakangas(at)vmware(dot)com wrote:
>
>> > But I'm not wedded to the idea if someone objects; a log message might
>> > also be reasonable: "LOG: huge TLB pages are not supported on this
>> > platform, but huge_tlb_pages was 'on'"
>>
>> Put that way, I have to wonder if the right thing to do is just to have
>> a "try_huge_pages=on|off" setting, and log a warning if the attempt did
>> not succeed. It would be easier to document, and I don't think there's
>> much point in making it an error if the allocation fails.
>
> What about
> huge_tlb_pages={off,try}
>
> Or maybe
> huge_tlb_pages={off,try,require}

I'd spell "require" as "on", or at least accept that as a synonym.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-21 21:58:11
Message-ID: 20131121215811.GG27838@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-11-21 16:24:56 -0500, Robert Haas wrote:
> > What about
> > huge_tlb_pages={off,try}
> >
> > Or maybe
> > huge_tlb_pages={off,try,require}
>
> I'd spell "require" as "on", or at least accept that as a synonym.

off, try, on is what the patch currently implements; Abhijit was just
arguing for dropping the error-out option.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2013-11-25 03:29:20
Message-ID: 20131125032920.GA23793@toroid.org
Lists: pgsql-hackers

At 2013-11-21 22:14:35 +0100, andres(at)2ndquadrant(dot)com wrote:
>
> I'd certainly want a setting that errors out if it cannot get the
> memory using hugetables.

OK, then the current try/on/off settings are fine.

I'm better today, so I'll read the patch Heikki posted and see what more
needs to be done there.

-- Abhijit


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-27 19:20:23
Message-ID: 20140127192023.GJ10723@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Heikki Linnakangas wrote:

> I spent some time whacking this around, new patch version attached.
> I moved the mmap() code into a new function, that leaves the
> PGSharedMemoryCreate more readable.

Did this patch go anywhere?

Someone just pinged me about a kernel scalability problem in Linux with
huge pages; if someone did performance measurements with this patch,
perhaps it'd be good to measure again with the kernel patch in place.

https://lkml.org/lkml/2014/1/26/227

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-28 11:51:03
Message-ID: 52E799A7.8080306@vmware.com
Lists: pgsql-hackers

On 01/27/2014 09:20 PM, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
>
>> I spent some time whacking this around, new patch version attached.
>> I moved the mmap() code into a new function, that leaves the
>> PGSharedMemoryCreate more readable.
>
> Did this patch go anywhere?

Oh darn, I remembered we had already committed this, but clearly not.
I'd love to still get this into 9.4. The latest patch
(hugepages-v5.patch) was pretty much ready for commit, except for
documentation.

- Heikki


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-28 11:58:23
Message-ID: 20140128115823.GA18499@defunct.ch
Lists: pgsql-hackers

Hi,

On 28/01/14 13:51, Heikki Linnakangas wrote:
> Oh darn, I remembered we had already committed this, but clearly not. I'd
> love to still get this into 9.4. The latest patch (hugepages-v5.patch) was
> pretty much ready for commit, except for documentation.

I'm working on it. I ported it to HEAD and am currently running some
benchmarks. Next will be documentation.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-28 13:12:59
Message-ID: 20140128131259.GB24091@defunct.ch
Lists: pgsql-hackers

Hi,

On 15/11/13 15:17, Heikki Linnakangas wrote:
> I spent some time whacking this around, new patch version attached. I moved
> the mmap() code into a new function, that leaves the PGSharedMemoryCreate
> more readable.

I think there's a bug in this version of the patch. Have a look at
this:

+ if (huge_tlb_pages == HUGE_TLB_ON || huge_tlb_pages == HUGE_TLB_TRY)
+ {
[…]
+ ptr = mmap(NULL, *size, PROT_READ | PROT_WRITE,
+ PG_MMAP_FLAGS | MAP_HUGETLB, -1, 0);
[…]
+ }
+#endif
+
+ if (huge_tlb_pages == HUGE_TLB_OFF || huge_tlb_pages == HUGE_TLB_TRY)
+ {
+ allocsize = *size;
+ ptr = mmap(NULL, *size, PROT_READ | PROT_WRITE, PG_MMAP_FLAGS, -1, 0);
+ }

This will lead to a duplicate mmap() if hugepages work and
huge_tlb_pages == HUGE_TLB_TRY, or am I missing something?
I think it should be like this:

if (huge_tlb_pages == HUGE_TLB_OFF ||
(huge_tlb_pages == HUGE_TLB_TRY && ptr == MAP_FAILED))
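To illustrate the intended control flow, here is a minimal, self-contained
sketch. It is not the patch itself: sketch_anonymous_shmem(), HugeTlbMode
and SKETCH_MMAP_FLAGS are hypothetical names standing in for the patch's
function, the huge_tlb_pages setting and PG_MMAP_FLAGS, and it assumes the
caller has already rounded size up to a multiple of the huge page size.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical stand-in for the three huge_tlb_pages settings. */
typedef enum { HUGE_TLB_OFF, HUGE_TLB_ON, HUGE_TLB_TRY } HugeTlbMode;

/* Hypothetical stand-in for PG_MMAP_FLAGS. */
#define SKETCH_MMAP_FLAGS (MAP_SHARED | MAP_ANONYMOUS)

/*
 * Map an anonymous shared segment, honouring the requested mode.
 * Returns MAP_FAILED on failure; with HUGE_TLB_ON there is no fallback,
 * so the caller is expected to report the error.
 */
static void *
sketch_anonymous_shmem(HugeTlbMode mode, size_t size)
{
    void *ptr = MAP_FAILED;

    assert(size > 0);

#ifdef MAP_HUGETLB
    /* "on" and "try" both attempt a huge page mapping first. */
    if (mode == HUGE_TLB_ON || mode == HUGE_TLB_TRY)
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   SKETCH_MMAP_FLAGS | MAP_HUGETLB, -1, 0);
#endif

    /*
     * Map normally only if huge pages are off, or if "try" was requested
     * and the huge page attempt did not succeed. Without the ptr check,
     * a successful "try" would be mapped a second time.
     */
    if (mode == HUGE_TLB_OFF ||
        (mode == HUGE_TLB_TRY && ptr == MAP_FAILED))
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   SKETCH_MMAP_FLAGS, -1, 0);

    return ptr;
}
```

The second condition is the only difference from the quoted hunk: the
first mapping is kept whenever it succeeded.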

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-28 16:11:11
Message-ID: 20140128161111.GE24091@defunct.ch
Lists: pgsql-hackers

Hi,

attached you will find a new version of the patch, ported to HEAD,
with the mentioned bug fixed and - hopefully - dealing with the remaining
issues.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
hugepages-v6.patch text/x-diff 11.9 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 12:12:37
Message-ID: 52E8F035.8080405@vmware.com
Lists: pgsql-hackers

On 01/28/2014 06:11 PM, Christian Kruse wrote:
> Hi,
>
> attached you will find a new version of the patch, ported to HEAD,
> with the mentioned bug fixed and - hopefully - dealing with the remaining
> issues.

Thanks, I have committed this now.

The documentation is still lacking. We should explain somewhere how to
set nr_hugepages, for example. The "Managing Kernel Resources" section
ought to mention that setting. Could I ask you to work on that, please?
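As a rough illustration of the arithmetic such documentation would need to
cover, the sketch below computes how many huge pages a shared memory
segment of a given size occupies. The function name is hypothetical, and
the 2MB huge page size used in the usage note is an assumption; the real
value varies by system.

```c
#include <assert.h>

/*
 * Number of huge pages needed to hold shmem_bytes, rounding up so that a
 * partially used final page is still counted. Hypothetical helper, for
 * illustration only.
 */
static unsigned long long
sketch_huge_pages_needed(unsigned long long shmem_bytes,
                         unsigned long long huge_page_bytes)
{
    assert(huge_page_bytes > 0);

    return shmem_bytes / huge_page_bytes +
           (shmem_bytes % huge_page_bytes != 0 ? 1 : 0);
}
```

For example, assuming 2MB huge pages, a 4GB segment needs
sketch_huge_pages_needed(4294967296ULL, 2097152ULL) pages, which is 2048.
The server's total shared memory request is larger than shared_buffers
alone, so the real setting needs some headroom on top of that.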

- Heikki


From: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Christian Kruse <christian(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 14:01:08
Message-ID: 52E909A4.2010804@dalibo.com
Lists: pgsql-hackers

On 01/29/2014 01:12 PM, Heikki Linnakangas wrote:
> On 01/28/2014 06:11 PM, Christian Kruse wrote:
>> Hi,
>>
>> attached you will find a new version of the patch, ported to HEAD,
>> with the mentioned bug fixed and - hopefully - dealing with the remaining
>> issues.
>
> Thanks, I have committed this now.
>
> The documentation is still lacking.
>

The documentation is indeed lacking since it breaks the build.

doc/src/sgml/config.sgml contains the line

normal allocation if that fails. With <literal>on</literal, failure

which doesn't correctly terminate the closing </literal> tag.

Trivial patch attached.

--
Vik

Attachment Content-Type Size
fix_tag.patch text/x-diff 938 bytes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
Cc: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 14:18:41
Message-ID: 52E90DC1.4040607@vmware.com
Lists: pgsql-hackers

On 01/29/2014 04:01 PM, Vik Fearing wrote:
> On 01/29/2014 01:12 PM, Heikki Linnakangas wrote:
>> The documentation is still lacking.
>
> The documentation is indeed lacking since it breaks the build.
>
> doc/src/sgml/config.sgml contains the line
>
> normal allocation if that fails. With <literal>on</literal, failure
>
> which doesn't correctly terminate the closing </literal> tag.
>
> Trivial patch attached.

Thanks, applied!

- Heikki


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Christian Kruse <christian(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 14:40:47
Message-ID: CAHyXU0wi5TRboWTcpR-rdJNt=3SFrKWbMXd0pKt8KrZ4TXr6Lw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 5:58 AM, Christian Kruse
<christian(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> On 28/01/14 13:51, Heikki Linnakangas wrote:
>> Oh darn, I remembered we had already committed this, but clearly not. I'd
>> love to still get this into 9.4. The latest patch (hugepages-v5.patch) was
>> pretty much ready for commit, except for documentation.
>
> I'm working on it. I ported it to HEAD and am currently running some
> benchmarks. Next will be documentation.

you mentioned benchmarks -- do you happen to have the results handy? (curious)

merlin


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Christian Kruse <christian(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 18:11:28
Message-ID: CAMkU=1zrMo3GJU_i3fHauf1S8Fj1pGHmEuDyBpTa5fh9PTtnwQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 29, 2014 at 4:12 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 01/28/2014 06:11 PM, Christian Kruse wrote:
>
>> Hi,
>>
>> attached you will find a new version of the patch, ported to HEAD,
>> with the mentioned bug fixed and - hopefully - dealing with the remaining
>> issues.
>>
>
> Thanks, I have committed this now.
>

I'm getting this warning now with gcc (GCC) 4.4.7:

pg_shmem.c: In function 'PGSharedMemoryCreate':
pg_shmem.c:332: warning: 'allocsize' may be used uninitialized in this
function
pg_shmem.c:332: note: 'allocsize' was declared here

Cheers,

Jeff


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 19:09:18
Message-ID: 20140129190918.GA31325@defunct.ch
Lists: pgsql-hackers

Hi,

On 29/01/14 14:12, Heikki Linnakangas wrote:
> The documentation is still lacking. We should explain somewhere how to set
> nr_hugepages, for example. The "Managing Kernel Resources" section ought to
> mention that setting. Could I ask you to work on that, please?

Of course! Attached you will find a patch for better documentation.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
huge_tlb_docs.patch text/x-diff 4.4 KB

From: Christian Kruse <christian(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 19:18:11
Message-ID: 20140129191811.GB31325@defunct.ch
Lists: pgsql-hackers

Hi,

On 29/01/14 10:11, Jeff Janes wrote:
> I'm getting this warning now with gcc (GCC) 4.4.7:

Interesting. I don't get that warning. But the compiler is (formally)
right.

> pg_shmem.c: In function 'PGSharedMemoryCreate':
> pg_shmem.c:332: warning: 'allocsize' may be used uninitialized in this
> function
> pg_shmem.c:332: note: 'allocsize' was declared here

Attached patch should fix that.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
hugepages-v7.patch text/x-diff 895 bytes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christian Kruse <christian(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 19:36:55
Message-ID: 52E95857.30503@vmware.com
Lists: pgsql-hackers

On 01/29/2014 09:18 PM, Christian Kruse wrote:
> Hi,
>
> On 29/01/14 10:11, Jeff Janes wrote:
>> I'm getting this warning now with gcc (GCC) 4.4.7:
>
> Interesting. I don't get that warning. But the compiler is (formally)
> right.
>
>> pg_shmem.c: In function 'PGSharedMemoryCreate':
>> pg_shmem.c:332: warning: 'allocsize' may be used uninitialized in this
>> function
>> pg_shmem.c:332: note: 'allocsize' was declared here

Hmm, I didn't get that warning either.

> Attached patch should fix that.

That's not quite right. If the first mmap() fails, allocsize is set to
the rounded-up size, but the second mmap() uses the original size for
the allocation. So it returns too high a value to the caller.

Ugh, it's actually broken anyway :-(. The first allocation also passes
*size to mmap(), so the calculated rounded-up allocsize value is not
used for anything.

Fix pushed.
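
For illustration only (this is not the pushed fix), the rounding issue
comes down to computing the rounded-up size once and then using that same
value both for the mmap() call and for the size reported back to the
caller. The helper name below is hypothetical, and the huge page size is
passed in as a parameter because it differs between systems.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round size up to the next multiple of huge_page_size. The result is
 * what should be passed to mmap() with MAP_HUGETLB and also returned to
 * the caller as the actual allocation size. Hypothetical helper.
 */
static size_t
sketch_round_up_to_huge_page(size_t size, size_t huge_page_size)
{
    assert(huge_page_size > 0);

    if (size % huge_page_size != 0)
        size += huge_page_size - (size % huge_page_size);

    return size;
}
```

If the huge page attempt fails and the code falls back to a normal
mapping, the unrounded size should be used and reported instead, so the
caller never sees a size that differs from what was actually mapped.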

- Heikki


From: Christian Kruse <christian(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 19:59:30
Message-ID: 20140129195930.GD31325@defunct.ch
Lists: pgsql-hackers

Hi,

On 29/01/14 21:36, Heikki Linnakangas wrote:
> […]
> Fix pushed.

You are right. Thanks. But there is another bug, see

<20140128154307(dot)GC24091(at)defunct(dot)ch>

ff. Attached you will find a patch fixing that.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
hugepages-v8.patch text/x-diff 841 bytes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christian Kruse <christian(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 20:17:22
Message-ID: 52E961D2.9060504@vmware.com
Lists: pgsql-hackers

On 01/29/2014 09:59 PM, Christian Kruse wrote:
> Hi,
>
> On 29/01/14 21:36, Heikki Linnakangas wrote:
>> […]
>> Fix pushed.
>
> You are right. Thanks. But there is another bug, see
>
> <20140128154307(dot)GC24091(at)defunct(dot)ch>
>
> ff. Attached you will find a patch fixing that.

Thanks. There are more cases of that in InternalIpcMemoryCreate; they
ought to be fixed as well. We should also grep the rest of the codebase
for more instances of that. And this needs to be back-patched.

- Heikki


From: Christian Kruse <christian(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-29 20:22:45
Message-ID: 20140129202245.GA11341@defunct.ch
Lists: pgsql-hackers

Hi,

On 29/01/14 22:17, Heikki Linnakangas wrote:
> Thanks. There are more cases of that in InternalIpcMemoryCreate; they ought
> to be fixed as well. We should also grep the rest of the codebase for more
> instances of that. And this needs to be back-patched.

I'm way ahead of you ;-) Working on it.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-01-30 07:28:59
Message-ID: 20140130072859.GA3557@defunct.ch
Lists: pgsql-hackers

Hi,

after I finally got documentation compilation working I updated the
patch to be syntactically correct. You will find it attached.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
huge_tlb_docs-v1.patch text/x-diff 4.4 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 15:29:32
Message-ID: 530CB6DC.6010205@gmx.net
Lists: pgsql-hackers

On 1/30/14, 2:28 AM, Christian Kruse wrote:
> after I finally got documentation compilation working I updated the
> patch to be syntactically correct. You will find it attached.

I don't think we should be explaining the basics of OS memory management
in our documentation. And if we did, we shouldn't copy it verbatim from
the Debian wiki without attribution.

I think this patch should be cut down to the paragraphs that cover the
actual configuration.

On a technical note, use <xref> instead of <link> for linking.
doc/src/sgml/README.links contains some information.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Christian Kruse <christian(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 15:53:44
Message-ID: CA+Tgmoa7bikoQzuJSDFH3N0qkgAG4BT_eWh38Fmh=WH_8dWJHA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 10:29 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> And if we did, we shouldn't copy it verbatim from
> the Debian wiki without attribution.

That is seriously not cool.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 16:01:30
Message-ID: 20140225160130.GR6718@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-02-25 10:29:32 -0500, Peter Eisentraut wrote:
> On 1/30/14, 2:28 AM, Christian Kruse wrote:
> > after I finally got documentation compilation working I updated the
> > patch to be syntactically correct. You will find it attached.
>
> I don't think we should be explaining the basics of OS memory management
> in our documentation.

Agreed.

> And if we did, we shouldn't copy it verbatim from the Debian wiki
> without attribution.

Is it actually? A quick comparison doesn't show that many similarities?
Christian?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 16:02:13
Message-ID: 20140225160213.GG1400@defunct.ch
Lists: pgsql-hackers

Hi,

On 25/02/14 10:29, Peter Eisentraut wrote:
> I don't think we should be explaining the basics of OS memory management
> in our documentation.

Well, I'm confused. I thought that's exactly what had been asked for.

> And if we did, we shouldn't copy it verbatim from the Debian wiki
> without attribution.

I didn't. This is a write-up of several articles, blog posts and
documentation I read about this topic.

However, if you think the texts are too similar, then we should add a
note, yes. Didn't mean to copy w/o referring to a source.

> I think this patch should be cut down to the paragraphs that cover the
> actual configuration.

I tried to cover the issues Heikki brought up in
<52861EEC(dot)2090702(at)vmware(dot)com>.

> On a technical note, use <xref> instead of <link> for linking.
> doc/src/sgml/README.links contains some information.

OK, I will post an updated patch later this evening.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 16:08:06
Message-ID: 20140225160806.GH1400@defunct.ch
Lists: pgsql-hackers

Hi,

On 25/02/14 17:01, Andres Freund wrote:
> > And if we did, we shouldn't copy it verbatim from the Debian wiki
> > without attribution.
>
> Is it actually? A quick comparison doesn't show that many similarities?
> Christian?

Not as far as I know. But of course, as I wrote the text I _also_
(that's not my only source) read the Debian article and I was
influenced by it. It may be that the texts are more similar than I
thought, although I still don't see it.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 17:18:02
Message-ID: 530CD04A.6020508@gmx.net
Lists: pgsql-hackers

On 2/25/14, 11:08 AM, Christian Kruse wrote:
> Hi,
>
> On 25/02/14 17:01, Andres Freund wrote:
>>> And if we did, we shouldn't copy it verbatim from the Debian wiki
>>> without attribution.
>>
>> Is it actually? A quick comparison doesn't show that many similarities?
>> Christian?
>
> Not as far as I know. But of course, as I wrote the text I _also_
> (that's not my only source) read the Debian article and I was
> influenced by it. It may be that the texts are more similar than I
> thought, although I still don't see it.

I suspect that it was done subconsciously. But I did notice it right
away, so there is something to it.

As I mentioned, I would just cut those introductory parts out.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 17:39:34
Message-ID: 20140225173934.GA1507@momjian.us
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 12:18:02PM -0500, Peter Eisentraut wrote:
> On 2/25/14, 11:08 AM, Christian Kruse wrote:
> > Hi,
> >
> > On 25/02/14 17:01, Andres Freund wrote:
> >>> And if we did, we shouldn't copy it verbatim from the Debian wiki
> >>> without attribution.
> >>
> >> Is it actually? A quick comparison doesn't show that many similarities?
> >> Christian?
> >
> > Not as far as I know. But of course, as I wrote the text I _also_
> > (that's not my only source) read the Debian article and I was
> > influenced by it. It may be that the texts are more similar than I
> > thought, although I still don't see it.
>
> I suspect that it was done subconsciously. But I did notice it right
> away, so there is something to it.
>
> As I mentioned, I would just cut those introductory parts out.

Should we link to the Debian wiki content?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Christian Kruse <christian(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 18:21:46
Message-ID: 3758.1393352506@sss.pgh.pa.us
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Tue, Feb 25, 2014 at 12:18:02PM -0500, Peter Eisentraut wrote:
>> As I mentioned, I would just cut those introductory parts out.

> Should we link to the Debian wiki content?

-1. We generally don't link to our *own* wiki in our SGML docs, let alone
things that aren't even under our project's control. Moreover, Debian
is not going to be explaining these things in a way that accounts for
non-Linux operating systems.

regards, tom lane


From: Andres Freund <andres(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Christian Kruse <christian(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-25 18:28:07
Message-ID: 20140225182807.GU6718@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-02-25 13:21:46 -0500, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Tue, Feb 25, 2014 at 12:18:02PM -0500, Peter Eisentraut wrote:
> >> As I mentioned, I would just cut those introductory parts out.
>
> > Should we link to the Debian wiki content?
>
> -1. We generally don't link to our *own* wiki in our SGML docs, let alone
> things that aren't even under our project's control.

Agreed. Especially as the interesting bit is the postgres specific
logic, not the rest.

I think all that's needed is to cut the first paragraphs that generally
explain what huge pages are in some detail from the text and make sure
the later paragraphs don't refer to the earlier ones.

> Moreover, Debian
> is not going to be explaining these things in a way that accounts for
> non-Linux operating systems.

It's a Linux-only feature so far, so that alone wouldn't be a problem.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 08:35:30
Message-ID: 20140226083530.GJ1400@defunct.ch
Lists: pgsql-hackers

Hi,

On 25/02/14 19:28, Andres Freund wrote:
> I think all that's needed is to cut the first paragraphs that generally
> explain what huge pages are in some detail from the text and make sure
> the later paragraphs don't refer to the earlier ones.

Attached you will find a new version of the patch.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
huge_tlb_docs-v2.patch text/x-diff 3.2 KB

From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 10:38:35
Message-ID: 20140226103835.GA16851@defunct.ch
Lists: pgsql-hackers

Hi Peter,

after a night of sleep I'm still not able to swallow the pill. To be
honest I'm a little bit angry about this accusation.

I didn't mean to copy from the Debian wiki and after re-checking the
text again I'm still convinced that I didn't.

Of course the text SAYS something similar, but this is in the nature
of things. Structure, diction and focus are different. Also the
information transferred is different and gathered from various
articles, including the Debian wiki, the huge page docs of the kernel,
Wikipedia, and some old IBM and Oracle docs.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>
Cc: Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 12:34:43
Message-ID: 530DDF63.7020700@vmware.com
Lists: pgsql-hackers

On 02/26/2014 10:35 AM, Christian Kruse wrote:
> On 25/02/14 19:28, Andres Freund wrote:
>> I think all that's needed is to cut the first paragraphs that generally
>> explain what huge pages are in some detail from the text and make sure
>> the later paragraphs don't refer to the earlier ones.
>
> Attached you will find a new version of the patch.

Thanks!

> huge_tlb_pages (enum)
>
> Enables/disables the use of huge TLB pages. Valid values are try (the default), on, and off.
>
> At present, this feature is supported only on Linux. The setting is ignored on other systems.
>
> The use of huge TLB pages results in smaller page tables and less CPU time spent on memory management, increasing performance. For more details, see Section 17.4.4.
>
> With huge_tlb_pages set to try, the server will try to use huge pages, but fall back to using normal allocation if that fails. With on, failure to use huge pages will prevent the server from starting up. With off, huge pages will not be used.

That still says "The setting is ignored on other systems". That's not
quite true: as explained later in the section, if you set
huge_tlb_pages=on and the platform doesn't support it, the server will
refuse to start.
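
To spell out the combinations, here is a minimal Python sketch of the
behaviour as the quoted text describes it. It is only a model, not the
server code: the function name choose_allocation and its two boolean
inputs are hypothetical, invented for this illustration.

```python
def choose_allocation(setting: str, platform_supports: bool, huge_alloc_ok: bool) -> str:
    """Model of the documented try/on/off semantics (hypothetical helper).

    Returns "huge" or "normal", or raises RuntimeError where the docs say
    the server would refuse to start.
    """
    if setting not in ("try", "on", "off"):
        raise ValueError(f"invalid setting: {setting!r}")

    if setting == "off":
        # Huge pages are never attempted.
        return "normal"

    if platform_supports and huge_alloc_ok:
        return "huge"

    if setting == "on":
        # Covers both an unsupported platform and a failed allocation.
        raise RuntimeError("huge pages requested but not available")

    # setting == "try": fall back to a normal allocation.
    return "normal"
```

In this model, "try" on a platform without support quietly returns
"normal", while "on" in the same situation raises. The second case is
what "The setting is ignored on other systems" fails to cover.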

> 17.4.4. Linux huge TLB pages

This section looks good to me. I'm OK with the level of detail, although
maybe just a sentence or two about what huge TLB pages are and what
benefits they have would still be in order. How about adding something
like this as the first sentence:

"Using huge TLB pages reduces overhead when using large contiguous
chunks of memory, like PostgreSQL does."

> To enable this feature in PostgreSQL you need a kernel with CONFIG_HUGETLBFS=y and CONFIG_HUGETLB_PAGE=y. You also have to tune the system setting vm.nr_hugepages. To calculate the number of necessary huge pages start PostgreSQL without huge pages enabled and check the VmPeak value from the proc filesystem:
>
> $ head -1 /path/to/data/directory/postmaster.pid
> 4170
> $ grep ^VmPeak /proc/4170/status
> VmPeak: 6490428 kB
>
> 6490428 / 2048 (PAGE_SIZE is 2MB in this case) are roughly 3169.154 huge pages, so you will need at least 3170 huge pages:
>
> $ sysctl -w vm.nr_hugepages=3170

That's good advice, but perhaps s/calculate/estimate/. It's just an
approximation, after all.
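
A rough sketch of that estimate in Python, using the numbers from the
quoted example. The helper names are hypothetical, and the 2048 kB
default is only an assumption that matches the 2MB case above; the real
size should be read from Hugepagesize in /proc/meminfo.

```python
def parse_vm_peak_kb(status_text: str) -> int:
    """Return the VmPeak value in kB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmPeak:"):
            return int(line.split()[1])
    raise ValueError("no VmPeak line found")


def estimate_huge_pages(vm_peak_kb: int, huge_page_kb: int = 2048) -> int:
    """Round the peak size up to a whole number of huge pages."""
    if vm_peak_kb < 0 or huge_page_kb <= 0:
        raise ValueError("vm_peak_kb must be >= 0 and huge_page_kb > 0")
    return -(-vm_peak_kb // huge_page_kb)  # ceiling division on integers


sample = "VmPeak:\t 6490428 kB"
print(estimate_huge_pages(parse_vm_peak_kb(sample)))  # 3170
```

VmPeak covers the whole address space of the postmaster, not only the
shared memory segment, so the result is an approximation on the
generous side rather than an exact requirement.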

- Heikki


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 14:25:24
Message-ID: 20140226142524.GK1400@defunct.ch
Lists: pgsql-hackers

Hi,

On 26/02/14 14:34, Heikki Linnakangas wrote:
> That still says "The setting is ignored on other systems". That's not quite
> true: as explained later in the section, if you set huge_tlb_pages=on and
> the platform doesn't support it, the server will refuse to start.

I added a sentence about it.

> "Using huge TLB pages reduces overhead when using large contiguous chunks of
> memory, like PostgreSQL does."

Sentence added.

> That's good advice, but perhaps s/calculate/estimate/. It's just an
> approximation, after all.

Fixed.

New patch version is attached.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
huge_tlb_docs-v3.patch text/x-diff 3.7 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 16:13:02
Message-ID: 20140226161302.GD4759@eldon.alvh.no-ip.org
Lists: pgsql-hackers


There's one thing that rubs me the wrong way about all this
functionality, which is that we've named it "huge TLB pages". That is
wrong -- the TLB pages are not huge. In fact, as far as I understand,
the TLB doesn't have pages at all. It's the pages that are huge, but
those pages are not TLB pages, they are just memory pages.

I think we have named it this way only because Linux for some reason
named the mmap() flag MAP_HUGETLB. The TLB is not huge
either (in fact you can't alter the size of the TLB at all; it's a
hardware thing.) I think this flag means "use the TLB entries reserved
for huge pages for the memory I'm requesting".
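
To put rough numbers on why fewer, larger pages help, here is a small
Python sketch that counts the lowest-level translations needed to map a
region. The 8-byte entry size is an assumption (it is what x86-64
uses), upper page-table levels are ignored, and the function names are
made up for the example.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB


def mapping_entries(region_bytes: int, page_bytes: int) -> int:
    """Number of pages (and so lowest-level entries) needed to map the region."""
    return -(-region_bytes // page_bytes)  # ceiling division


def page_table_bytes(region_bytes: int, page_bytes: int, entry_bytes: int = 8) -> int:
    """Approximate page-table size, counting only lowest-level entries."""
    return mapping_entries(region_bytes, page_bytes) * entry_bytes


print(page_table_bytes(4 * GiB, 4 * KiB))  # 8388608 bytes, i.e. 8 MiB
print(page_table_bytes(4 * GiB, 2 * MiB))  # 16384 bytes, i.e. 16 KiB
```

For a 4GB shared_buffers this is in line with the "nearly 8MB of memory
per backend" reported at the start of this thread, presumably because
each backend carries its own page tables for the shared segment.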

Since we haven't released any of this, should we discuss renaming it to
just "huge pages"?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Christian Kruse <christian(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 17:56:44
Message-ID: 530E2ADC.9040001@vmware.com
Lists: pgsql-hackers

On 02/26/2014 06:13 PM, Alvaro Herrera wrote:
>
> There's one thing that rubs me the wrong way about all this
> functionality, which is that we've named it "huge TLB pages". That is
> wrong -- the TLB pages are not huge. In fact, as far as I understand,
> the TLB doesn't have pages at all. It's the pages that are huge, but
> those pages are not TLB pages, they are just memory pages.
>
> I think we have named it this way only because Linux for some reason
> named the mmap() flag MAP_HUGETLB. The TLB is not huge
> either (in fact you can't alter the size of the TLB at all; it's a
> hardware thing.) I think this flag means "use the TLB entries reserved
> for huge pages for the memory I'm requesting".
>
> Since we haven't released any of this, should we discuss renaming it to
> just "huge pages"?

Linux calls it "huge tlb pages" in many places, not just MAP_HUGETLB.
Like in CONFIG_HUGETLB_PAGE and hugetlbfs. I agree it's a bit weird.
Linux also calls it just "huge pages" in many other places, like in
/proc/meminfo output.

FreeBSD calls them superpages and Windows calls them "large pages".
Yeah, it would seem better to call them just "huge pages", so that it's
more reminiscent of those names, if we ever implement support for
super/huge/large pages on other platforms.

- Heikki


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-26 19:37:24
Message-ID: 20140226193724.GL2921@tamriel.snowman.net
Lists: pgsql-hackers

Christian,

Thanks for working on all of this and dealing with the requests for
updates and changes, as well as for dealing very professionally with an
inappropriate and incorrect remark. Unfortunately, mailing lists can
make communication difficult and someone's knee-jerk reaction (not
referring to your reaction here) can end up causing much frustration.

Remind me when we're at a conference somewhere and I'll gladly buy you a
beer (or whatever your choice is). Seriously, thanks for working on the
'huge pages' changes and documentation- it's often a thankless job and
clearly one which can be extremely frustrating.

Thanks again,

Stephen


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-27 07:34:48
Message-ID: 20140227073448.GA24373@defunct.ch
Lists: pgsql-hackers

Hi,

On 26/02/14 13:13, Alvaro Herrera wrote:
>
> There's one thing that rubs me the wrong way about all this
> functionality, which is that we've named it "huge TLB pages". That is
> wrong -- the TLB pages are not huge. In fact, as far as I understand,
> the TLB doesn't have pages at all. It's the pages that are huge, but
> those pages are not TLB pages, they are just memory pages.

I hadn't thought about this yet, but you are totally right.

> Since we haven't released any of this, should we discuss renaming it to
> just "huge pages"?

Attached is a patch with the updated documentation (which now
consistently says "huge pages"), a renamed GUC, consistent wording
(always "huge pages"), and renamed variables.

Should I create a new commit fest entry for this and delete the old
one? Or should this be done in two patches? Locally in my repo this is
done with two commits, so it would be easy to split that.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
huge_tlb_docs_with_renamed_guc_and_variables.patch text/x-diff 10.4 KB

From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-27 07:35:32
Message-ID: 20140227073532.GB24373@defunct.ch
Lists: pgsql-hackers

Hi Peter,

thank you for your nice words, much appreciated. I'm sorry that I was
so whiny about this in the last post.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-27 08:08:33
Message-ID: 20140227080833.GA8142@defunct.ch
Lists: pgsql-hackers

Hi,

On 27/02/14 08:35, Christian Kruse wrote:
> Hi Peter,

Sorry, Stephen of course – it was definitely too early.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-02-28 17:43:56
Message-ID: 5310CADC.1000209@vmware.com
Lists: pgsql-hackers

On 02/27/2014 09:34 AM, Christian Kruse wrote:
> Hi,
>
> On 26/02/14 13:13, Alvaro Herrera wrote:
>>
>> There's one thing that rubs me the wrong way about all this
>> functionality, which is that we've named it "huge TLB pages". That is
>> wrong -- the TLB pages are not huge. In fact, as far as I understand,
>> the TLB doesn't have pages at all. It's the pages that are huge, but
>> those pages are not TLB pages, they are just memory pages.
>
> I hadn't thought about this yet, but you are totally right.
>
>> Since we haven't released any of this, should we discuss renaming it to
>> just "huge pages"?
>
> Attached is a patch with the updated documentation (which now
> consistently says "huge pages"), a renamed GUC, consistent wording
> (always "huge pages"), and renamed variables.

Hmm, I wonder if that could now be misunderstood to have something to do
with the PostgreSQL page size? Maybe add the word "memory" or "operating
system" in the first sentence in the docs, like this: "Enables/disables
the use of huge memory pages".

> <para>
> At present, this feature is supported only on Linux. The setting is
> ignored on other systems when set to <literal>try</literal>.
> <productname>PostgreSQL</productname> will
> refuse to start when set to <literal>on</literal>.
> </para>

Is it clear enough that PostgreSQL will only refuse to start up when
it's set to on, *if the feature's not supported on the platform*?
Perhaps just leave that last sentence out. It's mentioned later that "
With <literal>on</literal>, failure to use huge pages will prevent the
server from starting up.", that's probably enough.

- Heikki


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Christian Kruse <christian(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-03-01 01:58:04
Message-ID: CAM3SWZTfBLqGBTGHdU5OP3PdaQrkndpMM2cc++zBgj-iZRGtRQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 9:43 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> Hmm, I wonder if that could now be misunderstood to have something to do
> with the PostgreSQL page size? Maybe add the word "memory" or "operating
> system" in the first sentence in the docs, like this: "Enables/disables the
> use of huge memory pages".

Whenever I wish to emphasize that distinction, I tend to use the term
"MMU pages".

--
Peter Geoghegan


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-03-03 09:34:23
Message-ID: 20140303093423.GC20834@defunct.ch
Lists: pgsql-hackers

Hi,

> > Attached is a patch with the updated documentation (which now
> > consistently says "huge pages"), a renamed GUC, consistent wording
> > (always "huge pages"), and renamed variables.
>
> Hmm, I wonder if that could now be misunderstood to have something to do
> with the PostgreSQL page size? Maybe add the word "memory" or "operating
> system" in the first sentence in the docs, like this: "Enables/disables the
> use of huge memory pages".

Accepted, see attached patch.

> > <para>
> > At present, this feature is supported only on Linux. The setting is
> > ignored on other systems when set to <literal>try</literal>.
> > <productname>PostgreSQL</productname> will
> > refuse to start when set to <literal>on</literal>.
> > </para>
>
> Is it clear enough that PostgreSQL will only refuse to start up when it's
> set to on, *if the feature's not supported on the platform*? Perhaps just
> leave that last sentence out. It's mentioned later that " With
> <literal>on</literal>, failure to use huge pages will prevent the server
> from starting up.", that's probably enough.

Fixed.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
hugepages-v9.patch text/x-diff 9.1 KB

From: Christian Kruse <christian(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-03-03 09:37:35
Message-ID: 20140303093734.GD20834@defunct.ch
Lists: pgsql-hackers

Hi,

On 28/02/14 17:58, Peter Geoghegan wrote:
> On Fri, Feb 28, 2014 at 9:43 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
> > Hmm, I wonder if that could now be misunderstood to have something to do
> > with the PostgreSQL page size? Maybe add the word "memory" or "operating
> > system" in the first sentence in the docs, like this: "Enables/disables the
> > use of huge memory pages".
>
> Whenever I wish to emphasize that distinction, I tend to use the term
> "MMU pages".

I would rather not deviate that much from Linux terminology; this may
lead to confusion. Using this term in only one place doesn't seem to
make sense either: the naming would then be inconsistent and thus lead
to confusion, too. Do you agree?

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christian Kruse <christian(at)2ndQuadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-03-03 19:03:06
Message-ID: 5314D1EA.8020007@vmware.com
Lists: pgsql-hackers

On 03/03/2014 11:34 AM, Christian Kruse wrote:
> Hi,
>
>>> Attached is a patch with the updated documentation (which now
>>> consistently says "huge pages"), a renamed GUC, consistent wording
>>> (always "huge pages"), and renamed variables.
>>
>> Hmm, I wonder if that could now be misunderstood to have something to do
>> with the PostgreSQL page size? Maybe add the word "memory" or "operating
>> system" in the first sentence in the docs, like this: "Enables/disables the
>> use of huge memory pages".
>
> Accepted, see attached patch.

Thanks, committed!

I spotted this in section "17.4.1 Shared Memory and Semaphores":

> Linux
>
> The default maximum segment size is 32 MB, and the default maximum total size is 2097152 pages. A page is almost always 4096 bytes except in unusual kernel configurations with "huge pages" (use getconf PAGE_SIZE to verify).

It's not any more wrong now than it's always been, but I don't think
huge pages ever affect PAGE_SIZE... Could I cajole you into rephrasing
that, too?
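
For what it's worth, the two values come from different places, which a
small Python sketch can show. The base page size is what getconf
PAGE_SIZE reports (os.sysconf("SC_PAGE_SIZE") is the Python
equivalent), while the huge page size is a separate Hugepagesize line
in /proc/meminfo. The helper names below are hypothetical, and the
sample text is made up for the example rather than copied from a real
system.

```python
import os
from typing import Optional


def parse_hugepagesize_kb(meminfo_text: str) -> Optional[int]:
    """Return Hugepagesize in kB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("Hugepagesize:"):
            return int(line.split()[1])
    return None


def base_page_size_bytes() -> int:
    """Base page size as reported by the OS (Unix only)."""
    return os.sysconf("SC_PAGE_SIZE")


# Illustrative input only; on Linux, read the real file instead.
sample_meminfo = "MemTotal:       16315384 kB\nHugepagesize:       2048 kB\n"
print(parse_hugepagesize_kb(sample_meminfo))  # 2048
```

On a Linux box one would pass open("/proc/meminfo").read() to the
parser and compare the result with base_page_size_bytes(). If huge
pages really leave PAGE_SIZE alone, the latter stays at the base page
size however the former is configured.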

- Heikki


From: Christian Kruse <christian(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Use MAP_HUGETLB where supported (v3)
Date: 2014-03-04 10:53:17
Message-ID: 20140304105317.GB3754@defunct.ch
Lists: pgsql-hackers

Hi,

On 03/03/14 21:03, Heikki Linnakangas wrote:
> I spotted this in section "17.4.1 Shared Memory and Semaphores":
>
> >Linux
> >
> > The default maximum segment size is 32 MB, and the default maximum total size is 2097152 pages. A page is almost always 4096 bytes except in unusual kernel configurations with "huge pages" (use getconf PAGE_SIZE to verify).
>
> It's not any more wrong now than it's always been, but I don't think huge
> pages ever affect PAGE_SIZE... Could I cajole you into rephrasing that, too?

Hm… to be honest, I'm not sure how to change that. What about this?

The default maximum segment size is 32 MB, and the
default maximum total size is 2097152
pages. A page is almost always 4096 bytes except in
kernel configurations with <quote>huge pages</quote>
(use <literal>cat /proc/meminfo | grep Hugepagesize</literal>
     to verify), but they have to be enabled explicitly via
<xref linkend="guc-huge-pages">. See
<xref linkend="linux-huge-pages"> for details.

I attached a patch doing this change.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
shm_docs-v1.patch text/x-diff 857 bytes