Re: Hash partitioning.

From: "Yuri Levinsky" <yuril(at)celltick(dot)com>
To: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Christopher Browne" <cbbrowne(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "PostgreSQL Mailing Lists" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash partitioning.
Date: 2013-06-26 13:41:37
Message-ID: B72526FA2066E344AFD09734A487318103E92BB3@falcon1.celltick.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki,
As far as I understand the height of the btree will affect the number of
I/Os necessary. The height of the tree does not increase linearly with
the number of records. May be I wrong in terminology but when I am
trying to insert data into empty table the insertion time is increasing
when number of records is growing. In order to keep indexes as small as
possible I usually split the table by hash if I don't have any better
alternative. On some systems hash functions +index might work faster
when only index for insert and search operations. This especially usable
when you have non unique index with small number of possible values that
you don't know in advance or that changing between your customers. In
that case the hash partition has to be used instead of index.

Sincerely yours,

Yuri Levinsky, DBA
Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel
Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222

-----Original Message-----
From: Heikki Linnakangas [mailto:hlinnakangas(at)vmware(dot)com]
Sent: Wednesday, June 26, 2013 2:23 PM
To: Yuri Levinsky
Cc: Tom Lane; Christopher Browne; Robert Haas; Bruce Momjian; PostgreSQL
Mailing Lists
Subject: Re: [HACKERS] Hash partitioning.

On 26.06.2013 11:17, Yuri Levinsky wrote:
> The main purpose of partitioning in my world is to store billions of
> rows and be able to search by date, hour or even minute as fast as
> possible.

Hash partitioning sounds like a bad fit for that use case. A regular
b-tree, possibly with range partitioning, sounds optimal for that.

> When you dealing with company, which has
> ~350.000.000 users, and you don't want to use key/value data stores:
> you need hash partitioned tables and hash partitioned table clusters
> to perform fast search and 4-6 tables join based on user phone number
> for example.

B-trees are surprisingly fast for key-value lookups. There is no reason
to believe that a hash partitioned table would be faster for that than a
plain table.

- Heikki

This mail was received via Mail-SeCure System.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2013-06-26 13:42:57 Re: A better way than tweaking NTUP_PER_BUCKET
Previous Message Amit Kapila 2013-06-26 13:37:50 Re: Optimizing pglz compressor