Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber)
Date: 2014-12-20 17:51:19
Message-ID: 4814.1419097879@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-12-19 22:03:55 -0600, Jim Nasby wrote:
>> What I am thinking is not using all of those fields in their raw form to calculate the hash value. IE: something analogous to:
>> hash_any(SharedBufHash, (((rot(forkNum, 2) | dbNode) ^ relNode) << 32) | blockNum)
>>
>> perhaps that actual code wouldn't work, but I don't see why we couldn't do something similar... am I missing something?

> I don't think that'd improve anything. Jenkins' hash does have quite
> good mixing properties; I don't believe that the above would improve the
> quality of the hash.

I think what Jim is suggesting is to intentionally degrade the quality of
the hash in order to let it be calculated a tad faster. We could do that
but I doubt it would be a win, especially in systems with lots of buffers.
IIRC, when we put in Jenkins hashing to replace the older homebrew hash
function, it improved performance even though the hash itself was slower.

regards, tom lane
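
To make the trade-off above concrete, here is a minimal sketch in C. It is
not PostgreSQL's actual code: DemoTag is a hypothetical, simplified stand-in
for the buffer tag, fold_tag() is one possible reading of the quoted formula
(with the parentheses balanced), and FNV-1a stands in for hash_any() purely
to keep the example self-contained. It illustrates one way that folding the
fields before hashing can lose information: two distinct tags whose dbNode
and relNode values are swapped fold to the same 64-bit key, so they collide
no matter how well the final hash mixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified buffer tag; four 32-bit fields, no padding. */
typedef struct
{
    uint32_t dbNode;
    uint32_t relNode;
    uint32_t forkNum;
    uint32_t blockNum;
} DemoTag;

/* Rotate a 32-bit value left by k bits (0 < k < 32). */
static uint32_t rot(uint32_t x, unsigned k)
{
    return (x << k) | (x >> (32 - k));
}

/* Fold the tag into 64 bits, following one reading of the quoted formula. */
static uint64_t fold_tag(const DemoTag *t)
{
    uint64_t hi = (rot(t->forkNum, 2) | t->dbNode) ^ t->relNode;

    return (hi << 32) | t->blockNum;
}

/* FNV-1a over a byte string; an assumed stand-in for hash_any(). */
static uint32_t fnv1a(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Cheaper variant: hash only the 8-byte folded key. */
static uint32_t hash_folded(const DemoTag *t)
{
    uint64_t key = fold_tag(t);

    return fnv1a(&key, sizeof key);
}

/* Full variant: hash every byte of the tag. */
static uint32_t hash_full(const DemoTag *t)
{
    return fnv1a(t, sizeof *t);
}

/* Returns 1 if the two tags are byte-for-byte identical. */
static int tags_equal(const DemoTag *a, const DemoTag *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}

/* Returns 1 if two distinct tags with swapped IDs collide after folding. */
static int demo_swapped_ids(void)
{
    DemoTag a = {.dbNode = 1, .relNode = 2, .forkNum = 0, .blockNum = 7};
    DemoTag b = {.dbNode = 2, .relNode = 1, .forkNum = 0, .blockNum = 7};

    assert(!tags_equal(&a, &b));
    return fold_tag(&a) == fold_tag(&b) &&
           hash_folded(&a) == hash_folded(&b);
}
```

Here demo_swapped_ids() returns 1 because 1 ^ 2 and 2 ^ 1 are the same
value, whereas hash_full() is handed two different byte strings. Whether the
cycles saved by hashing 8 bytes instead of 16 outweigh the cost of such
collisions is exactly the question raised above; the sketch does not measure
that.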
