A population of population counts

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: A population of population counts
Date: 2016-05-07 00:41:15
Message-ID: CAEepm=3k++Ytf2LNQCvpP6m1=gY9zZHP_cfnn47=WTsoCrLCvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

I noticed that we have three "number_of_ones" tables under contrib and
two under src, and some new specially masked variants for visibility
maps.

Would it be an improvement if we just defined one table with external
linkage, and accessed it via a macros/functions popcount_uint8, and
wider versions _uint32, popcount_array(data, len) that sum the
popcounts of their component bytes?

Then there would be less duplication, and future opportunities to use
fancy built-ins/assembler instructions/vectorisation in one central
place, and to work in larger sizes than bytes.

Perhaps we could also get rid of the new special masked popcount
tables by masking the value we look up instead, eg walk through the
page calling popcount_uint64(value & FROZEN_BITMASK_64).

As for the rightmost_one_pos table in bitmapset.c, couldn't the
various bms_XXX functions just use ffs(n) - 1 and work word-at-a-time?
That generates a bsf instruction at -O2 on this machine.

The micro-optimisation opportunities probably don't matter, but I
wondered if it might at least be interesting to delete a bunch of
code, and re-use a standard interface for something that apparently
several modules need to do.

--
Thomas Munro
http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2016-05-07 00:43:24 Re: [HACKERS] Re: pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Previous Message Peter Geoghegan 2016-05-06 23:58:39 "Allow usage of huge maintenance_work_mem for GIN build" patch