Storing hot members of PGPROC out of the band

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Storing hot members of PGPROC out of the band
Date: 2011-11-03 18:12:52
Message-ID: CABOikdOceQ42p4PP5qY+nD-JkrjicJsVasgXT3an7TadmeWHUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi All,

While working on some of the performance issues on HP-UX, I noticed a
significant data cache misses for accessing PGPROC members. On a close
inspection, it was quite evident that for large number (even few 10s)
of clients, the loop inside GetSnapshotData will cause data cache miss
for almost every PGPROC because the PGPROC structure is quite heavy
and no more than one structure may fit in a single cache line. So I
experimented by separating the most frequently and closely accessed
members of the PGPROC into an out of band array. I call it
PGPROC_MINIMAL structure which contains xid, xmin, vacuumFlags amongst
others. Basically, all the commonly accessed members by
GetSnapshotData find a place in this minimal structure.

When PGPROC array is allocated, we also allocate another array of
PGPROC_MINIMAL structures of the same size. While accessing the
ProcArray, a simple pointer mathematic can get us the corresponding
PGPROC_MINIMAL structure. The only exception being the dummy PGPROC
for prepared transaction. A first cut version of the patch is
attached. It looks big, but most of the changes are cosmetic because
of added indirection. The patch also contains another change to keep
the ProcArray sorted by (PGPROC *) to preserve locality of references
while traversing the array.

I did some tests of a 32 core IA HP-UX box and the results are quite
good. With a scale factor of 100 and -N option of pgbench (updates on
only accounts table), the numbers look something like this:

Clients HEAD PGPROC-Patched Gain
1 1098.488663 1059.830369 -3.52%
4 3569.481435 3663.898254 2.65%
32 11627.059228 16419.864056 41.22%
48 11044.501244 15825.132582 43.29%
64 10432.206525 15408.50304 47.70%
80 10210.57835 15170.614435 48.58%

The numbers are quite reproducible with couple of percentage points
variance. So even for single client, I sometimes see no degradation.
Here are some more numbers with the normal pgbench tests (without -N
option).

Clients HEAD PGPROC-Patched Gain
1 743 771 3.77%
4 1821 2315 27.13%
32 8011 9166 14.42%
48 7282 8959 23.03%
64 6742 8937 32.56%
80 6316 8664 37.18%

Its quite possible that the effect of the patch is more evident on the
particular hardware that I am testing. But the approach nevertheless
seems reasonable. It will very useful if someone else having access to
a large box can test the effect of the patch.

BTW, since I played with many versions of the patch, the exact numbers
with this version might be a little different than what I posted
above. I will conduct more tests, especially with more number of
clients and see if there is any difference in the improvement.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

Attachment Content-Type Size
pgproc-minimal-v13.patch text/x-patch 37.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-11-03 18:14:39 Further plans to refactor xlog.c
Previous Message Alvaro Herrera 2011-11-03 18:12:49 foreign key locks, 2nd attempt