Re: Analysis on backend-private memory usage (and a patch)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Analysis on backend-private memory usage (and a patch)
Date: 2013-09-05 10:29:23
Message-ID: 52285D03.3040802@vmware.com
Lists: pgsql-hackers

On 04.09.2013 23:56, Tom Lane wrote:
> Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> writes:
>> One fairly simple thing we could do is to teach catcache.c to resize the
>> caches. Then we could make the initial size of all the syscaches much
>> smaller.
>
> I think this is attractive for the *other* reason you mention, namely
> preserving reasonable performance when a catcache grows larger than
> expected; but I'm pretty skeptical of nickel-and-diming caches that are
> already really small. Is it really worth cutting the TSPARSER caches
> from 4 pointers to 2 for instance?

Yeah, that may be overdoing it. Then again, enlarging a hash table from
2 to 4 buckets when needed is also very cheap.

> What concerns me about initially-undersized caches is that we'll waste
> space and time in the enlargement process.

Enlarging a small hash table is very cheap because, well, it's small.
Enlarging a large hash table is more expensive, but if you have a lot of
entries in the hash, then you also get the benefit of a larger hash when
doing lookups. It does require some memory to hold the old and the new
hash table while rehashing, but again, with a small hash table that's
not significant, and with a large one the actual cached tuples take a
lot more space than the buckets array anyway.
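
For concreteness, the enlargement amounts to something like this (a
simplified sketch with made-up names, not the actual catcache code):
allocate a bucket array twice the old size and re-link every entry into
its new bucket.

#include <stdlib.h>

typedef struct Entry
{
    struct Entry *next;         /* next entry in the same bucket */
    unsigned int  hashvalue;    /* hash of the cache key */
    /* ... the cached tuple would follow here ... */
} Entry;

static void
rehash(Entry ***buckets, int *nbuckets)
{
    int     newsize = *nbuckets * 2;
    Entry **newbuckets = calloc(newsize, sizeof(Entry *));
    int     i;

    /* walk the old buckets and re-link each entry into its new bucket */
    for (i = 0; i < *nbuckets; i++)
    {
        Entry  *e = (*buckets)[i];

        while (e != NULL)
        {
            Entry  *next = e->next;
            int     b = e->hashvalue % newsize;

            e->next = newbuckets[b];
            newbuckets[b] = e;
            e = next;
        }
    }

    free(*buckets);
    *buckets = newbuckets;
    *nbuckets = newsize;
}

The only transient extra memory is the new bucket array itself; the
entries are re-linked in place.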

Per Wikipedia [1]:

> If the table size increases or decreases by a fixed percentage at each expansion, the total cost of these resizings, amortized over all insert and delete operations, is still a constant, independent of the number of entries n and of the number m of operations performed.
>
> For example, consider a table that was created with the minimum possible size and is doubled each time the load ratio exceeds some threshold. If m elements are inserted into that table, the total number of extra re-insertions that occur in all dynamic resizings of the table is at most m − 1. In other words, dynamic resizing roughly doubles the cost of each insert or delete operation.

Considering the amount of work involved in adding an entry to a catalog
cache, I wouldn't worry about adding a tiny constant to the insertion time.
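
To spell out the arithmetic behind the quoted claim: starting from a
single bucket and doubling each time, inserting m = 2^k entries causes
at most

    \sum_{i=0}^{k-1} 2^i = 2^k - 1 = m - 1

extra re-insertions, i.e. less than one extra re-insertion per entry on
average.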

I did some quick testing by creating 100000 tables, and running a
pgbench script that selects randomly from them:

\setrandom tableno 1 100000
select * from foo:tableno where i = 1;

I timed the rehash operations with gettimeofday calls before and after
the rehash:

> LOG: rehashed catalog cache id 44 for pg_class from 256 to 512 buckets: 27 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 128 to 256 buckets: 14 us
> LOG: rehashed catalog cache id 44 for pg_class from 512 to 1024 buckets: 54 us
> LOG: rehashed catalog cache id 45 for pg_class from 256 to 512 buckets: 29 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 256 to 512 buckets: 30 us
> LOG: rehashed catalog cache id 44 for pg_class from 1024 to 2048 buckets: 147 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 512 to 1024 buckets: 87 us
> LOG: rehashed catalog cache id 45 for pg_class from 512 to 1024 buckets: 80 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 512 to 1024 buckets: 88 us
> LOG: rehashed catalog cache id 44 for pg_class from 2048 to 4096 buckets: 342 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 1024 to 2048 buckets: 197 us
> LOG: rehashed catalog cache id 45 for pg_class from 1024 to 2048 buckets: 183 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 1024 to 2048 buckets: 194 us
> LOG: rehashed catalog cache id 44 for pg_class from 4096 to 8192 buckets: 764 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 2048 to 4096 buckets: 401 us
> LOG: rehashed catalog cache id 45 for pg_class from 2048 to 4096 buckets: 383 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 2048 to 4096 buckets: 406 us
> LOG: rehashed catalog cache id 44 for pg_class from 8192 to 16384 buckets: 1758 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 4096 to 8192 buckets: 833 us
> LOG: rehashed catalog cache id 45 for pg_class from 4096 to 8192 buckets: 842 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 4096 to 8192 buckets: 859 us
> LOG: rehashed catalog cache id 44 for pg_class from 16384 to 32768 buckets: 3564 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 8192 to 16384 buckets: 1769 us
> LOG: rehashed catalog cache id 45 for pg_class from 8192 to 16384 buckets: 1752 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 8192 to 16384 buckets: 1719 us
> LOG: rehashed catalog cache id 44 for pg_class from 32768 to 65536 buckets: 7538 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 16384 to 32768 buckets: 3644 us
> LOG: rehashed catalog cache id 45 for pg_class from 16384 to 32768 buckets: 3609 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 16384 to 32768 buckets: 3508 us
> LOG: rehashed catalog cache id 44 for pg_class from 65536 to 131072 buckets: 16457 us
> LOG: rehashed catalog cache id 7 for pg_attribute from 32768 to 65536 buckets: 7978 us
> LOG: rehashed catalog cache id 45 for pg_class from 32768 to 65536 buckets: 8281 us
> LOG: rehashed catalog cache id 47 for pg_statistic from 32768 to 65536 buckets: 7724 us

At the larger hash sizes, the time spent in rehashing works out to
about 60 ns per catcache entry (the patch rehashes when fillfactor
reaches 2, so when rehashing e.g. from 256 to 512 buckets, there are
1024 entries in the hash).
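
For reference, the LOG lines above came from instrumentation along
these lines (a sketch with hypothetical names; the actual patch does
this inside catcache.c and reports via elog(LOG, ...)):

#include <stdio.h>
#include <sys/time.h>

/* microseconds elapsed between two gettimeofday() readings */
static long
elapsed_usecs(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000L +
           (end->tv_usec - start->tv_usec);
}

static void
rehash_with_timing(int oldbuckets)
{
    struct timeval before, after;
    int newbuckets = oldbuckets * 2;

    gettimeofday(&before, NULL);
    /* ... re-link all entries into a bucket array of twice the size,
     * as in the rehash() sketch earlier in this mail ... */
    gettimeofday(&after, NULL);

    printf("rehashed from %d to %d buckets: %ld us\n",
           oldbuckets, newbuckets, elapsed_usecs(&before, &after));
}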

> I'd suggest trying to get some
> numbers about the typical size of each cache in a backend that's done a
> few things (not merely started up --- we should not be optimizing for the
> case of connections that get abandoned without running any queries).
> Then set the initial size to the next larger power of 2.

Makes sense.

I ran pgbench for ten seconds, and printed the number of tuples in each
catcache after that:

LOG: cache id 61 on (not known yet): 0 tups
LOG: cache id 60 on (not known yet): 0 tups
LOG: cache id 59 on pg_type: 6 tups
LOG: cache id 58 on (not known yet): 0 tups
LOG: cache id 57 on (not known yet): 0 tups
LOG: cache id 56 on (not known yet): 0 tups
LOG: cache id 55 on (not known yet): 0 tups
LOG: cache id 54 on (not known yet): 0 tups
LOG: cache id 53 on (not known yet): 0 tups
LOG: cache id 52 on (not known yet): 0 tups
LOG: cache id 51 on (not known yet): 0 tups
LOG: cache id 50 on (not known yet): 0 tups
LOG: cache id 49 on (not known yet): 0 tups
LOG: cache id 48 on pg_tablespace: 1 tups
LOG: cache id 47 on pg_statistic: 11 tups
LOG: cache id 46 on (not known yet): 0 tups
LOG: cache id 45 on pg_class: 4 tups
LOG: cache id 44 on pg_class: 8 tups
LOG: cache id 43 on (not known yet): 0 tups
LOG: cache id 42 on pg_proc: 4 tups
LOG: cache id 41 on pg_proc: 1 tups
LOG: cache id 40 on (not known yet): 0 tups
LOG: cache id 39 on (not known yet): 0 tups
LOG: cache id 38 on pg_operator: 2 tups
LOG: cache id 37 on pg_operator: 2 tups
LOG: cache id 36 on (not known yet): 0 tups
LOG: cache id 35 on pg_namespace: 3 tups
LOG: cache id 34 on (not known yet): 0 tups
LOG: cache id 33 on (not known yet): 0 tups
LOG: cache id 32 on pg_index: 5 tups
LOG: cache id 31 on (not known yet): 0 tups
LOG: cache id 30 on (not known yet): 0 tups
LOG: cache id 29 on (not known yet): 0 tups
LOG: cache id 28 on (not known yet): 0 tups
LOG: cache id 27 on (not known yet): 0 tups
LOG: cache id 26 on (not known yet): 0 tups
LOG: cache id 25 on (not known yet): 0 tups
LOG: cache id 24 on (not known yet): 0 tups
LOG: cache id 23 on (not known yet): 0 tups
LOG: cache id 22 on (not known yet): 0 tups
LOG: cache id 21 on pg_database: 1 tups
LOG: cache id 20 on (not known yet): 0 tups
LOG: cache id 19 on (not known yet): 0 tups
LOG: cache id 18 on (not known yet): 0 tups
LOG: cache id 17 on (not known yet): 0 tups
LOG: cache id 16 on (not known yet): 0 tups
LOG: cache id 15 on (not known yet): 0 tups
LOG: cache id 14 on (not known yet): 0 tups
LOG: cache id 13 on (not known yet): 0 tups
LOG: cache id 12 on pg_cast: 1 tups
LOG: cache id 11 on pg_authid: 1 tups
LOG: cache id 10 on pg_authid: 1 tups
LOG: cache id 9 on (not known yet): 0 tups
LOG: cache id 8 on (not known yet): 0 tups
LOG: cache id 7 on pg_attribute: 6 tups
LOG: cache id 6 on (not known yet): 0 tups
LOG: cache id 5 on (not known yet): 0 tups
LOG: cache id 4 on pg_amop: 1 tups
LOG: cache id 3 on pg_amop: 2 tups
LOG: cache id 2 on pg_am: 1 tups
LOG: cache id 1 on (not known yet): 0 tups
LOG: cache id 0 on (not known yet): 0 tups
CacheMemoryContext: 516096 total in 6 blocks; 81192 free (0 chunks);
434904 used

I'm surprised how few rows there are in the caches after that. Actually,
given that, and the above timings, I'm starting to think that we should
just get rid of the hand-tuned initial sizes altogether. Start all
caches small, say with just one bucket, and let the automatic rehashing
enlarge them as required. Of course, a more complicated query will touch
many more catalogs, but resizing is cheap enough that it doesn't really
matter.
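
To make that concrete, the trigger can be as simple as this (again a
sketch with made-up names, building on the rehash()/Entry sketch above;
not the actual patch):

static void
insert_entry(Entry ***buckets, int *nbuckets, int *nentries, Entry *e)
{
    int b = e->hashvalue % *nbuckets;

    /* link the new entry into its bucket */
    e->next = (*buckets)[b];
    (*buckets)[b] = e;
    (*nentries)++;

    /* double the bucket array once the fill factor exceeds 2 */
    if (*nentries > *nbuckets * 2)
        rehash(buckets, nbuckets);
}

With an initial size of one bucket, a cache that never grows beyond a
handful of entries never pays for more than a few pointers.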

PS. Once the hashes are resized on demand, perhaps we should get rid of
the move-to-the-head behavior in SearchCatCache. If the buckets contain
< 2 entries on average, moving the just-accessed entry to the head of
its bucket hardly makes any difference in lookup times. But it does add
a couple of cycles to every cache hit.

- Heikki
