Re: Single client performance on trivial SELECTs

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Single client performance on trivial SELECTs
Date: 2011-04-14 19:36:09
Message-ID: 4DA74CA9.1070100@enterprisedb.com
Lists: pgsql-hackers

On 14.04.2011 17:43, Tom Lane wrote:
> Greg Smith<greg(at)2ndquadrant(dot)com> writes:
>> samples % image name symbol name
>> 53548 6.7609 postgres AllocSetAlloc
>> 32787 4.1396 postgres MemoryContextAllocZeroAligned
>> 26330 3.3244 postgres base_yyparse
>> 21723 2.7427 postgres hash_search_with_hash_value
>> 20831 2.6301 postgres SearchCatCache
>> 19094 2.4108 postgres hash_seq_search
>> 18402 2.3234 postgres hash_any
>> 15975 2.0170 postgres AllocSetFreeIndex
>> 14205 1.7935 postgres _bt_compare
>> 13370 1.6881 postgres core_yylex
>> 10455 1.3200 postgres MemoryContextAlloc
>> 10330 1.3042 postgres LockAcquireExtended
>> 10197 1.2875 postgres ScanKeywordLookup
>> 9312 1.1757 postgres MemoryContextAllocZero
>
> Yeah, this is pretty typical ...

In this case you could just use prepared statements and get rid of all
the parser-related overhead, which accounts for much of the allocations.
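
Something like this through libpq, just to illustrate (an untested
sketch, not taken from pgbench; the statement name, loop count and
connection string are made up, and result error checking is omitted;
the query is the one pgbench -S runs):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=pgbench");
    const char *paramValues[1] = {"42"};
    PGresult   *res;
    int         i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* parsed and analyzed once, on the server side */
    res = PQprepare(conn, "getbalance",
                    "SELECT abalance FROM pgbench_accounts WHERE aid = $1",
                    1, NULL);
    PQclear(res);

    /* re-executions skip base_yyparse, ScanKeywordLookup, etc. */
    for (i = 0; i < 10000; i++)
    {
        res = PQexecPrepared(conn, "getbalance", 1, paramValues,
                             NULL, NULL, 0);
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}

pgbench's -M prepared mode goes through PQprepare()/PQexecPrepared()
like this, so only the first execution of each script statement pays
for the raw parser.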

>> I don't know nearly enough about the memory allocator to comment on
>> whether it's possible to optimize it better for this case to relieve any
>> bottleneck.
>
> I doubt that it's possible to make AllocSetAlloc radically cheaper.
> I think the more likely route to improvement there is going to be to
> find a way to do fewer pallocs. For instance, if we had more rigorous
> rules about which data structures are read-only to which code, we could
> probably get rid of a lot of just-in-case tree copying that happens in
> the parser and planner.
>
> But at the same time, even if we could drive all palloc costs to zero,
> it would only make a 10% difference in this example. And this sort of
> fairly flat profile is what I see in most cases these days --- we've
> been playing performance whack-a-mole for long enough now that there
> isn't much low-hanging fruit left. For better or worse, the system
> design we've chosen just isn't amenable to minimal overhead for simple
> queries. I think a lot of this ultimately traces to the extensible,
> data-type-agnostic design philosophy. The fact that we don't know what
> an integer is until we look in pg_type, and don't know what an "="
> operator does until we look up its properties, is great from a flexibility
> point of view; but this sort of query is where the costs become obvious.

I think the general strategy for making this kind of query faster will
be to add various fastpaths that cache results and skip even more of
that work.

There's one very low-hanging fruit here, though. I profiled the pgbench
case with -M prepared and found that, as in Greg Smith's profile,
hash_seq_search pops up quite high in the list. Those calls come from
LockReleaseAll(), where we scan the local lock hash to find all the
locks held. We specify the initial size of the local lock hash table as
128, which is unnecessarily large for small queries like this. Reducing
it to 8 slashed the time spent in hash_seq_search().

I think we should make that initial size smaller. It won't buy much,
somewhere between 1-5% in this test case, but it's very easy to do and
I don't see much downside: it's a local hash table, so it will still
grow as needed.
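
To put rough numbers on why the allocated size matters even when the
table is nearly empty, here's a throwaway standalone program (mine, not
backend code; the real table is a dynahash created with hash_create()
in lock.c, this just counts the bucket visits a full sequential scan
has to make):

#include <stdio.h>
#include <stdlib.h>

typedef struct Entry
{
    int         key;
    struct Entry *next;
} Entry;

/* bucket headers plus chain entries a full scan must touch */
static long
seq_scan_cost(Entry **buckets, int nbuckets)
{
    long        visits = 0;

    for (int b = 0; b < nbuckets; b++)
    {
        visits++;               /* the bucket header itself */
        for (Entry *e = buckets[b]; e != NULL; e = e->next)
            visits++;           /* each held "lock" in the chain */
    }
    return visits;
}

static long
simulate(int nbuckets, int nlocks)
{
    Entry     **buckets = calloc(nbuckets, sizeof(Entry *));

    for (int i = 0; i < nlocks; i++)
    {
        Entry      *e = malloc(sizeof(Entry));

        e->key = i;
        e->next = buckets[i % nbuckets];
        buckets[i % nbuckets] = e;
    }
    return seq_scan_cost(buckets, nbuckets);    /* leaks; throwaway demo */
}

int
main(void)
{
    /* a trivial SELECT holds only a couple of local locks */
    printf("128 buckets, 2 locks: %ld visits\n", simulate(128, 2));
    printf("  8 buckets, 2 locks: %ld visits\n", simulate(8, 2));
    return 0;
}

With only two entries held, that's 130 bucket visits versus 10 in this
model, per scan.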

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
