Re: MMAP Buffers

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <gsstark(at)mit(dot)edu>, Greg Smith <greg(at)2ndquadrant(dot)com>, Joshua Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MMAP Buffers
Date: 2011-04-17 21:32:18
Message-ID: 201104172332.18381.rsmogura@softperience.eu
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> Sunday 17 April 2011 22:01:55
> On Sun, Apr 17, 2011 at 11:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Radosław Smogura <rsmogura(at)softperience(dot)eu> writes:
> >> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> Sunday 17 April 2011 01:35:45
> >>
> >>> ... Huh? Are you saying that you ask the kernel to map each individual
> >>> shared buffer separately? I can't believe that's going to scale to
> >>> realistic applications.
> >>
> >> No, I do
> >> mremap(mmap_buff_A, MAP_FIXED, tmp),
> >> mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A),
> >> mremap(tmp, MAP_FIXED, mmap_buff_A).
> >
> > There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
> > nor on my quite-up-to-date OS X box. The Linux man page for it says
> > "This call is Linux-specific, and should not be used in programs
> > intended to be portable." So if the patch is dependent on that call,
> > it's dead on arrival from a portability standpoint.
> >
> > But in any case, you didn't explain how use of mremap() avoids the
> > problem of the kernel having to maintain a separate page-mapping-table
> > entry for each individual buffer. (Per process, yet.) If that's what's
> > happening, it's going to be a significant performance penalty as well as
> > (I suspect) a serious constraint on how many buffers can be managed.
>
> I share your suspicions, although no harm in measuring it.
>
> What I don't understand is how this approach avoids the problem of
> different processes seeing different buffer contents. If backend A
> has the buffer mmap'd and backend B wants to modify it (and changes
> the mapping), backend A is still looking at the old buffer contents,
> isn't it? And then things go boom.

Each process has a simple "mirror" of the shared descriptors.

I "believe" that modifications to buffer content may be done only while
holding an exclusive lock (with some simple exceptions) (+ MVCC). Actually,
I saw only two things that can change already loaded data and cause damage,
both of which you have described: setting hint bits during a scan, and
vacuum. The first may only cause, I think, two processes to ask for the same
transaction statuses (except for vacuum); the second is impossible, as
vacuum requires an exclusive pin. When the buffer tag is changed, the
version of the buffer is bumped up and checked against the local version;
this concerns reading a buffer.
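
A minimal sketch of the version check described above, for illustration
only. All type and function names (SharedBufDesc, LocalMirror,
retag_buffer, mirror_is_current, refresh_mirror) are hypothetical and are
not taken from the patch; the sketch also ignores the locking and memory
ordering that real shared memory would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none are taken from the patch. */

/* Shared descriptor: one per buffer, visible to every backend. */
typedef struct SharedBufDesc {
    uint32_t tag_version;   /* bumped each time the buffer tag changes */
} SharedBufDesc;

/* Per-process "mirror" of a shared descriptor. */
typedef struct LocalMirror {
    bool     mapped;        /* does this backend have the page mapped? */
    uint32_t seen_version;  /* tag_version at the time it was mapped */
} LocalMirror;

/* The buffer is reused for another page: bump the version. */
static void retag_buffer(SharedBufDesc *desc)
{
    assert(desc != NULL);
    desc->tag_version++;
}

/* True if the local mapping still matches the shared descriptor. */
static bool mirror_is_current(const SharedBufDesc *desc, const LocalMirror *m)
{
    assert(desc != NULL && m != NULL);
    return m->mapped && m->seen_version == desc->tag_version;
}

/* Stand-in for the remap step; a real backend would redo its mapping here. */
static void refresh_mirror(const SharedBufDesc *desc, LocalMirror *m)
{
    assert(desc != NULL && m != NULL);
    m->seen_version = desc->tag_version;
    m->mapped = true;
}
```

A reader would call mirror_is_current() before touching the page and fall
back to refresh_mirror() when it returns false.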

In other cases, after obtaining the lock, a check is done whether the buffer
has an associated updatable buffer and whether the local "mirror" has it too;
then the swap should take place.

The logic for updatable buffers is similar to that of "shared buffers": each
updatable buffer has a pin count, and an updatable buffer can't be freed
while someone is using it. In contrast to "normal buffers", however,
updatable buffers don't have any support for locking etc. Updatable buffers
exist only on the free list or while associated with a buffer.
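
The pin-count rule and the "free list or associated" invariant could look
roughly like this. It is a single-process sketch under assumed names
(UpdatableBuf, FreeList, ub_attach, ub_pin, ub_unpin, ub_release), none of
which come from the patch, and it deliberately has no locking, matching
the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical structures; not taken from the patch. */
typedef struct UpdatableBuf {
    int pin_count;                   /* number of current users */
    int owner;                       /* associated buffer id, -1 if free */
    struct UpdatableBuf *next_free;  /* link used only on the free list */
} UpdatableBuf;

typedef struct FreeList {
    UpdatableBuf *head;
} FreeList;

/* Take an updatable buffer off the free list and associate it with buf_id.
 * Returns NULL when the free list is empty. */
static UpdatableBuf *ub_attach(FreeList *fl, int buf_id)
{
    UpdatableBuf *ub = fl->head;
    if (ub == NULL)
        return NULL;
    fl->head = ub->next_free;
    ub->next_free = NULL;
    ub->owner = buf_id;
    ub->pin_count = 0;
    return ub;
}

static void ub_pin(UpdatableBuf *ub)
{
    assert(ub->owner != -1);         /* only associated buffers are pinned */
    ub->pin_count++;
}

static void ub_unpin(UpdatableBuf *ub)
{
    assert(ub->pin_count > 0);
    ub->pin_count--;
}

/* Return the buffer to the free list; refused while anyone has it pinned. */
static bool ub_release(FreeList *fl, UpdatableBuf *ub)
{
    if (ub->pin_count > 0)
        return false;
    ub->owner = -1;
    ub->next_free = fl->head;
    fl->head = ub;
    return true;
}
```

Every UpdatableBuf is therefore either reachable from the free list
(owner == -1) or tied to exactly one buffer id, never both.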

In the future I will change the version to a shared segment id, something
like the relation's oid + block, but the ids will be numbered contiguously
(1, 2, 3, ...), so I will be able to bypass smgr/md during reads as well as
the tag version check. This looks like a faster solution.

Regards,
Radek
