Quick Links

Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc:	Tomas Vondra <tv(at)fuzzy(dot)cz>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption
Date:	2013-10-08 09:42:00
Message-ID:	1D681ED2-9418-45E7-8090-FB66BE5D799F@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Sent from my iPad

>
> However I'm unable to make it at least comparable to chaining. The trick
> is that the hash table performs reasonably only until it's ~ 65-70% full,
> it gets really slow after that. So to maintain performance comparable to
> chaining, I'd have to keep the table below this threshold, effectively
> wasting ~30% of memory.
I expected that. AFAIK, open addressing gets slow when the load factor approaches 1. I feel this is what is happening here.
>
> And the topic of this thread was about decreasing the memory consumptions,
> so it seems to me open-addressing is not a good match here. I'll try a few
> more things but I don't think it's going to fly.
>
Yeah, I also feel that open addressing isn't the way to go for the problem here.

> I've made some significant improvements in the chaining version (in the
> master branch), now getting about the memory consumption I've estimated.
>
I agree, we can hope to reduce the memory consumption by making changes in the current chaining implementation. I would like to look into changing the data structure used for chaining from singly linked list to maybe skip list or something else.

Regards,

Atri

In response to

Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption at 2013-10-08 09:11:57 from Tomas Vondra

Responses

Re: Re: custom hash-based COUNT(DISTINCT) aggregate - unexpectedly high memory consumption at 2013-10-08 10:45:00 from Tomas Vondra

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2013-10-08 09:46:10	Re: Patch for fail-back without fresh backup
Previous Message	Marko Tiikkaja	2013-10-08 09:41:47	Re: plpgsql.print_strict_params