hash taboo?

Lists: pgsql-general
From: admin <admin(at)wtbwts(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: hash taboo?
Date: 1999-12-17 22:17:03
Message-ID: Pine.BSF.4.10.9912172211140.8286-100000@server.b0x.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

I've been reading the postgresql manual and I find there is very little
discussion about hash compared to btree. Most of the focus seems to be on
using btree indices even that the default for 'create index' is btree
also. From the documentation, it seems the only difference between either
searching method is that btree can be used with multiple operators whilst
hash can only be used with '='. Furthermore, hash seems to be contained in
memory, so should be limited to small queries or, in my case, queries
using limit (without using sort which would need to retrieve the entire
data anyways).

My conclusion is that if I can live with just using '=' and using slightly
more memory, I should be using hash. Unfortunately, there is very little
sign in the documentation that I should be using hash at all. Perhaps I
have missed something important.

If someone could help me make a more rational decision on using searching
methods, I'd appreciate.

Thanks in advance,
Marc


From: admin <admin(at)wtbwts(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] hash taboo?
Date: 1999-12-17 22:41:58
Message-ID: Pine.BSF.4.10.9912172236100.8409-100000@server.b0x.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

My results were exactly the same for btree and hash, even when vacumming
between each index creation. Here's my query:
SELECT * FROM prod_base WHERE mid='2';

Here's my result:
Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)

My database is perhaps not big enough to run some relevant tests, so
please let me know if there's another way I could get a better idea of the
resources used for using each searching method.

> Run some performace tests and let us know.
>
> > I've been reading the postgresql manual and I find there is very little
> > discussion about hash compared to btree. Most of the focus seems to be on
> > using btree indices even that the default for 'create index' is btree
> > also. From the documentation, it seems the only difference between either
> > searching method is that btree can be used with multiple operators whilst
> > hash can only be used with '='. Furthermore, hash seems to be contained in
> > memory, so should be limited to small queries or, in my case, queries
> > using limit (without using sort which would need to retrieve the entire
> > data anyways).
> >
> > My conclusion is that if I can live with just using '=' and using slightly
> > more memory, I should be using hash. Unfortunately, there is very little
> > sign in the documentation that I should be using hash at all. Perhaps I
> > have missed something important.
> >
> > If someone could help me make a more rational decision on using searching
> > methods, I'd appreciate.
> >
> > Thanks in advance,
> > Marc
> >
> >
> > ************
> >
> >
>
>
> --
> Bruce Momjian | http://www.op.net/~candle
> maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>


From: admin <admin(at)wtbwts(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] hash taboo?
Date: 1999-12-17 22:54:22
Message-ID: Pine.BSF.4.10.9912172250001.8458-100000@server.b0x.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Excellent point, your last comment gives me a tangible incentive for using
hash instead of btree. Since I don't need to use other operators than '=',
there is really no need to spend extra time creating a btree while all I
need is a hash table. In the end, both are as fast for searching, but I
gain some additional speed for inserting and removing entries.

> > My results were exactly the same for btree and hash, even when vacumming
> > between each index creation. Here's my query:
> > SELECT * FROM prod_base WHERE mid='2';
> >
> > Here's my result:
> > Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)
> >
> > My database is perhaps not big enough to run some relevant tests, so
> > please let me know if there's another way I could get a better idea of the
> > resources used for using each searching method.
>
> You have to look at index creation speed and index access speed.
>
> Not sure which one wins in each category. Also, index modification
> speed may be important.

Thanks again,
Marc


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: admin <admin(at)wtbwts(dot)com>
Cc: pgsql-general(at)postgreSQL(dot)org
Subject: Re: [GENERAL] hash taboo?
Date: 1999-12-18 03:24:06
Message-ID: 199912180324.WAA03818@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


Run some performace tests and let us know.

> I've been reading the postgresql manual and I find there is very little
> discussion about hash compared to btree. Most of the focus seems to be on
> using btree indices even that the default for 'create index' is btree
> also. From the documentation, it seems the only difference between either
> searching method is that btree can be used with multiple operators whilst
> hash can only be used with '='. Furthermore, hash seems to be contained in
> memory, so should be limited to small queries or, in my case, queries
> using limit (without using sort which would need to retrieve the entire
> data anyways).
>
> My conclusion is that if I can live with just using '=' and using slightly
> more memory, I should be using hash. Unfortunately, there is very little
> sign in the documentation that I should be using hash at all. Perhaps I
> have missed something important.
>
> If someone could help me make a more rational decision on using searching
> methods, I'd appreciate.
>
> Thanks in advance,
> Marc
>
>
> ************
>
>

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: admin <admin(at)wtbwts(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] hash taboo?
Date: 1999-12-18 03:46:14
Message-ID: 199912180346.WAA06174@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> My results were exactly the same for btree and hash, even when vacumming
> between each index creation. Here's my query:
> SELECT * FROM prod_base WHERE mid='2';
>
> Here's my result:
> Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)
>
> My database is perhaps not big enough to run some relevant tests, so
> please let me know if there's another way I could get a better idea of the
> resources used for using each searching method.

You have to look at index creation speed and index access speed.

Not sure which one wins in each category. Also, index modification
speed may be important.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: admin <admin(at)wtbwts(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] hash taboo?
Date: 1999-12-18 03:56:03
Message-ID: 199912180356.WAA06915@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> Excellent point, your last comment gives me a tangible incentive for using
> hash instead of btree. Since I don't need to use other operators than '=',
> there is really no need to spend extra time creating a btree while all I
> need is a hash table. In the end, both are as fast for searching, but I
> gain some additional speed for inserting and removing entries.

Is the hash faster to create?

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026