Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)
Date: 2014-12-20 05:15:42
Message-ID: 20141220051542.GM5023@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-12-19 22:03:55 -0600, Jim Nasby wrote:
> I'm not suggesting we change BufferTag or BufferLookupEnt; clearly we
> can't simply throw away any of the fields I was talking about (well,
> except possibly tablespace ID. AFAICT that's completely redundant for
> searching because relid is UNIQUE).

It's actually not. BufferTags contain relnodes via RelFileNode - that's
not the relation's oid, but the filenode. And that's *not* guaranteed
to be unique across databases, unfortunately.
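
To make that concrete, here is a simplified sketch of the key layout. The
field names follow my recollection of the PostgreSQL headers of that era
(relfilenode.h, buf_internals.h); the fixed-width typedefs, the helper
buffer_tags_equal() and the sample values in the note below are assumptions
made for illustration, not code from the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL typedefs (assumption). */
typedef uint32_t Oid;
typedef uint32_t BlockNumber;

/* Physical identity of a relation: the filenode, not the relation's oid. */
typedef struct RelFileNode
{
    Oid spcNode;   /* tablespace */
    Oid dbNode;    /* database */
    Oid relNode;   /* relation filenode */
} RelFileNode;

/* Key used to look up a page in the shared buffer hash table. */
typedef struct BufferTag
{
    RelFileNode rnode;     /* which physical relation */
    int32_t     forkNum;   /* which fork of it */
    BlockNumber blockNum;  /* block number within that fork */
} BufferTag;

/* Hypothetical helper: two tags name the same page only if every field matches. */
static bool
buffer_tags_equal(const BufferTag *a, const BufferTag *b)
{
    return a->rnode.spcNode == b->rnode.spcNode &&
           a->rnode.dbNode == b->rnode.dbNode &&
           a->rnode.relNode == b->rnode.relNode &&
           a->forkNum == b->forkNum &&
           a->blockNum == b->blockNum;
}
```

Two tags with relNode 16384 in databases 12345 and 54321 carry the same
relNode but refer to different pages, so the lookup key has to include more
than relNode alone.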

> What I am thinking is not using all of those fields in their raw form to calculate the hash value. IE: something analogous to:
>
> hash_any(SharedBufHash, ((rot(forkNum, 2) | dbNode) ^ relNode) << 32 | blockNum)
>
> perhaps that actual code wouldn't work, but I don't see why we couldn't do something similar... am I missing something?

I don't think that'd improve anything. Jenkins' hash already has quite
good mixing properties; I don't believe that the above would improve the
quality of the hash.
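
A related way to look at it, as a rough sketch only: folding the fields by
hand before hashing can make distinct keys indistinguishable, which no hash
function can undo afterwards. The code below is one possible reading of the
quoted expression; rot32() and folded_key() are hypothetical names, and the
expression is interpreted here as building a 64-bit value.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by k bits (0 < k < 32). */
static uint32_t
rot32(uint32_t x, unsigned k)
{
    return (x << k) | (x >> (32 - k));
}

/*
 * One reading of the proposed fold: combine forkNum, dbNode and relNode
 * into the high 32 bits and put blockNum into the low 32 bits.
 */
static uint64_t
folded_key(uint32_t forkNum, uint32_t dbNode, uint32_t relNode,
           uint32_t blockNum)
{
    uint64_t hi = (rot32(forkNum, 2) | dbNode) ^ relNode;

    return (hi << 32) | blockNum;
}
```

With this reading, (forkNum=1, dbNode=0) and (forkNum=0, dbNode=4) fold to
the same value for equal relNode and blockNum, because the OR merges their
bits. Swapping dbNode=1, relNode=2 for dbNode=2, relNode=1 collides as well.
Hashing the raw fields avoids losing that information up front.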

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
