Re: Hash partitioning.

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Yuri Levinsky <yuril(at)celltick(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Browne <cbbrowne(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash partitioning.
Date: 2013-06-26 18:14:05
Message-ID: CAGTBQpYxs=CKm1dFT_1KEwbguxdWm95XT-nF76n_CM2NrOa5xA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 26, 2013 at 11:14 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Wed, Jun 26, 2013 at 05:10:00PM +0300, Heikki Linnakangas wrote:
>> In practice, there might be a lot of quirks and inefficiencies and
>> locking contention etc. involved in various DBMS's, that you might
>> be able to work around with hash partitioning. But from a
>> theoretical point of view, there is no reason to expect just
>> partitioning a table on a hash to make key-value lookups any faster.
>
> Good analysis. Has anyone benchmarked this to know our btree is
> efficient in this area?

Yep. I had at one point, and came to the same conclusion. I ended up
building a few partial indices, and have been happy ever since.
Granted, my DB isn't that big, just around 200G.

No, I don't have the benchmark results. It's been a while. Back then,
it was 8.3, so I did the partitioning on the application. It still
wasn't worth it.

Now I just have two indices. One that indexes only hot tuples, it's
very heavily queried and works blazingly fast, and one that indexes by
(hotness, key). I include the hotness value on the query, and still
works quite fast enough. Luckily, I know things become cold after an
update to mark them cold, so I can do that. I included hotness on the
index to cluster updates on the hot part of the index, but I could
have just used a regular index and paid a small price on the updates.
Indeed, for a while it worked without the hotness, and there was no
significant trouble. I later found out that WAL bandwidth was
noticeably decreased when I added that hotness column, so I did, helps
a bit with replication. Has worked ever since.

Today, I only use "partitioning" to split conceptually different
tables, so I can have different schemas for each table (and normalize
with a view). Now it's 9.2, so the view works quite nicely and
transparently. I have yet to find a use for hash partitioning.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rodrigo Gonzalez 2013-06-26 18:22:06 Re: Kudos for Reviewers -- straw poll
Previous Message Bruce Momjian 2013-06-26 18:13:32 Re: Kudos for Reviewers -- straw poll