Re: PrivateRefCount (for 8.3)

From: NikhilS <nikkhils(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PrivateRefCount (for 8.3)
Date: 2007-01-16 09:55:58
Message-ID: d3c4af540701160155j23e4e846uae30a7b919f49aff@mail.gmail.com
Lists: pgsql-hackers

Hi,

> Most likely a waste of development effort --- have you got any evidence
> of a real effect here? With 200 max_connections the size of the arrays
> is still less than 10% of the space occupied by the buffers themselves,
> ergo there isn't going to be all that much cache-thrashing compared to
> what happens in the buffers themselves. You're going to be hard pressed
> to buy back the overhead of the hashing.
>
> It might be interesting to see whether we could shrink the refcount
> entries to int16 or int8. We'd need some scheme to deal with overflow,
> but given that the counts are now backed by ResourceOwner entries, maybe
> extra state could be kept in those entries to handle it.

I did some instrumentation coupled with pgbench/dbt2/views/join query runs
to find out the following:

(a) Maximum number of buffers pinned simultaneously by a backend: 6-9

(b) Maximum value of simultaneous pins on a given buffer by a backend: 4-6

(a) indicates that for a large shared_buffers value we waste space on a big
PrivateRefCount array in every backend (the current allocation is
sizeof(int32) * shared_buffers).
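
As a rough illustration of the arithmetic behind (a), here is a minimal
sketch. The helper names are hypothetical (they are not in the source tree),
and the figures in the usage note assume shared_buffers = 131072 (1 GB of
8 kB pages) and the 200 connections mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes used by one backend's PrivateRefCount array,
 * which holds one int32 per shared buffer. */
static size_t
private_refcount_bytes(size_t nbuffers)
{
    assert(nbuffers <= SIZE_MAX / sizeof(int32_t));   /* no overflow */
    return nbuffers * sizeof(int32_t);
}

/* Hypothetical helper: the same figure summed over all backends. */
static size_t
total_refcount_bytes(size_t nbuffers, size_t nbackends)
{
    size_t per_backend = private_refcount_bytes(nbuffers);

    assert(nbackends == 0 || per_backend <= SIZE_MAX / nbackends);
    return per_backend * nbackends;
}
```

Under those assumptions, private_refcount_bytes(131072) is 512 kB per
backend and total_refcount_bytes(131072, 200) is 100 MB overall, while the
instrumentation shows fewer than 10 entries in use at any time.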

(b) indicates that the refcount to be tracked per buffer is a small value,
so Tom's suggestion of exploring int16 or int8 might be worthwhile.

The following hash table proposal is based on the above readings:

- Do away with allocating the NBuffers-sized PrivateRefCount array, which is
an allocation of (NBuffers * sizeof(int)).
- Define Pvt_RefCnt_Size to be 64 (128?) or some such value, so that it is
several multiples of the ranges observed above. Define Overflow_Size to be 8
or some similar small value to handle collisions.

- Define the following Hash Table entry to keep track of reference counts

struct HashRefCntEnt
{
int32 BufferId;
int32 RefCnt;
int32 NextEnt; /* To handle collisions */
};

- Define a similar Overflow Table entry as above to handle collisions.

An array HashRefCntTable of Pvt_RefCnt_Size such HashRefCntEnt entries will
be initialized in the InitBufferPoolAccess function.

An OverflowTable of size Overflow_Size will be allocated. This array will be
resized dynamically (to 2 * the current Overflow_Size) when it cannot
accommodate further collisions from the main table.
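
A minimal sketch of how the two tables could be set up and how the overflow
table could be doubled. The function names, the INVALID_ENT marker and the
use of malloc/realloc are assumptions made to keep the example
self-contained; the real code would live in InitBufferPoolAccess and use the
backend's own memory allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define Pvt_RefCnt_Size 64
#define Overflow_Size   8
#define INVALID_ENT     (-1)    /* assumed marker for "no buffer / no link" */

typedef struct HashRefCntEnt
{
    int32_t BufferId;           /* INVALID_ENT while the slot is unused */
    int32_t RefCnt;
    int32_t NextEnt;            /* index into OverflowTable, or INVALID_ENT */
} HashRefCntEnt;

static HashRefCntEnt HashRefCntTable[Pvt_RefCnt_Size];
static HashRefCntEnt *OverflowTable = NULL;
static int32_t OverflowCapacity = 0;

/* Mark entries [from, to) as unused. */
static void
init_entries(HashRefCntEnt *ents, int32_t from, int32_t to)
{
    for (int32_t i = from; i < to; i++)
    {
        ents[i].BufferId = INVALID_ENT;
        ents[i].RefCnt = 0;
        ents[i].NextEnt = INVALID_ENT;
    }
}

/* Set up the main table and the initial overflow table; 0 on success. */
static int
init_refcnt_tables(void)
{
    init_entries(HashRefCntTable, 0, Pvt_RefCnt_Size);
    OverflowTable = malloc((size_t) Overflow_Size * sizeof(HashRefCntEnt));
    if (OverflowTable == NULL)
        return -1;
    OverflowCapacity = Overflow_Size;
    init_entries(OverflowTable, 0, OverflowCapacity);
    return 0;
}

/* Double the overflow table, keeping existing entries; 0 on success. */
static int
grow_overflow_table(void)
{
    int32_t newcap;
    HashRefCntEnt *newtab;

    assert(OverflowCapacity > 0);
    newcap = OverflowCapacity * 2;
    newtab = realloc(OverflowTable, (size_t) newcap * sizeof(HashRefCntEnt));
    if (newtab == NULL)
        return -1;
    init_entries(newtab, OverflowCapacity, newcap);
    OverflowTable = newtab;
    OverflowCapacity = newcap;
    return 0;
}
```

Because NextEnt is an array index rather than a pointer, the chains stay
valid even when the resize moves the overflow table in memory.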

We do not want the overhead of a costly hashing function, so we will use
modulo Pvt_RefCnt_Size to get the index where the buffer needs to go. In
short, our hash function is (bufid % Pvt_RefCnt_Size), which should be a
cheap enough operation.

Considering that at most 9-10 buffers will be pinned at a time, the
probability of collisions will be low. Collisions arise only if buffers with
ids (x, x + Pvt_RefCnt_Size, x + 2*Pvt_RefCnt_Size, etc.) are used in the
same operation. This should be pretty rare.
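
The hash and the collision condition can be sketched as follows. The
function names are hypothetical, Pvt_RefCnt_Size is taken to be 64, and
buffer ids are assumed to be non-negative.

```c
#include <assert.h>
#include <stdint.h>

#define Pvt_RefCnt_Size 64

/* Slot in the main table for a buffer id: plain modulo, no mixing step. */
static int32_t
refcnt_hash(int32_t bufid)
{
    assert(bufid >= 0);
    return bufid % Pvt_RefCnt_Size;
}

/* Two buffer ids collide exactly when they differ by a multiple of
 * Pvt_RefCnt_Size. */
static int
refcnt_collide(int32_t a, int32_t b)
{
    return refcnt_hash(a) == refcnt_hash(b);
}
```

For example, buffer ids 5, 69 and 133 all map to slot 5, while 5 and 6 never
collide. With a power-of-two table size the modulo is equivalent to
bufid & (Pvt_RefCnt_Size - 1) for non-negative ids.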

Functions PinBuffer, PinBuffer_Locked, IncrBufferRefCount, UnpinBuffer etc.
will be modified to use the above mechanism. The changes will be localized
in the buf_init.c and bufmgr.c files only.
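
To make the intended pin/unpin bookkeeping concrete, here is a minimal
sketch of the lookups those functions would perform. Everything named
refcnt_* is hypothetical, as are the INVALID_ENT marker and the choice to
reuse zero-count entries in place instead of unlinking them. The overflow
table is kept at a fixed size here; a fuller version would double it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Pvt_RefCnt_Size 64
#define Overflow_Size   8
#define INVALID_ENT     (-1)

typedef struct HashRefCntEnt
{
    int32_t BufferId;
    int32_t RefCnt;
    int32_t NextEnt;            /* index into OverflowTable, or INVALID_ENT */
} HashRefCntEnt;

static HashRefCntEnt HashRefCntTable[Pvt_RefCnt_Size];
static HashRefCntEnt OverflowTable[Overflow_Size];
static int32_t OverflowUsed = 0;

static void
refcnt_init(void)
{
    HashRefCntEnt empty = {INVALID_ENT, 0, INVALID_ENT};

    for (int i = 0; i < Pvt_RefCnt_Size; i++)
        HashRefCntTable[i] = empty;
    for (int i = 0; i < Overflow_Size; i++)
        OverflowTable[i] = empty;
    OverflowUsed = 0;
}

/* Walk the chain for bufid; return its entry, or NULL if it has none. */
static HashRefCntEnt *
refcnt_find(int32_t bufid)
{
    HashRefCntEnt *ent;

    assert(bufid >= 0);
    ent = &HashRefCntTable[bufid % Pvt_RefCnt_Size];
    for (;;)
    {
        if (ent->BufferId == bufid)
            return ent;
        if (ent->NextEnt == INVALID_ENT)
            return NULL;
        ent = &OverflowTable[ent->NextEnt];
    }
}

/* Add one pin; returns the new count, or -1 if the overflow table is full. */
static int32_t
refcnt_pin(int32_t bufid)
{
    HashRefCntEnt *ent;
    HashRefCntEnt *free_ent = NULL;

    assert(bufid >= 0);
    ent = &HashRefCntTable[bufid % Pvt_RefCnt_Size];
    for (;;)
    {
        if (ent->BufferId == bufid)
            return ++ent->RefCnt;
        if (ent->RefCnt == 0 && free_ent == NULL)
            free_ent = ent;     /* zero-count entry in this chain: reusable */
        if (ent->NextEnt == INVALID_ENT)
            break;
        ent = &OverflowTable[ent->NextEnt];
    }
    if (free_ent == NULL)
    {
        if (OverflowUsed >= Overflow_Size)
            return -1;          /* a fuller version would grow the table */
        ent->NextEnt = OverflowUsed++;
        free_ent = &OverflowTable[ent->NextEnt];
    }
    free_ent->BufferId = bufid; /* NextEnt is left alone, chain stays intact */
    free_ent->RefCnt = 1;
    return 1;
}

/* Drop one pin; returns the new count, or -1 if the buffer is not pinned. */
static int32_t
refcnt_unpin(int32_t bufid)
{
    HashRefCntEnt *ent = refcnt_find(bufid);

    if (ent == NULL || ent->RefCnt == 0)
        return -1;
    return --ent->RefCnt;
}

/* Current pin count for bufid, 0 if it has no entry. */
static int32_t
refcnt_get(int32_t bufid)
{
    HashRefCntEnt *ent = refcnt_find(bufid);

    return ent ? ent->RefCnt : 0;
}
```

With this layout an unpinned buffer leaves its slot linked with a zero
count, and the next buffer hashing to the same chain takes the slot over, so
chains only grow when several colliding buffers are pinned at the same time.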

Comments please.

Regards,
Nikhils

--
EnterpriseDB http://www.enterprisedb.com
