Re: Single client performance on trivial SELECTs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Single client performance on trivial SELECTs
Date: 2011-04-14 18:15:00
Message-ID: BANLkTinKZ05HqyORBq8puC4kK8j2pZ0hLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 14, 2011 at 7:43 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I doubt that it's possible to make AllocSetAlloc radically cheaper.
> I think the more likely route to improvement there is going to be to
> find a way to do fewer pallocs.  For instance, if we had more rigorous
> rules about which data structures are read-only to which code, we could
> probably get rid of a lot of just-in-case tree copying that happens in
> the parser and planner.
>
> But at the same time, even if we could drive all palloc costs to zero,
> it would only make a 10% difference in this example.  And this sort of
> fairly flat profile is what I see in most cases these days --- we've
> been playing performance whack-a-mole for long enough now that there
> isn't much low-hanging fruit left.

There are other architectural approaches that we could take to
reducing the parsing overhead. Random ideas:

1. Separate the parser into a parser for DML statements only, and
another parser for everything else, to avoid cluttering the main
parser with lots of keywords and non-terminals that aren't going to be
used for the kinds of queries people care about parsing speed.

2. Hand-code a parser, or use some tool other than bison to generate one.

3. Some kind of parse-tree cache, so that executing the same exact
statement many times doesn't require reparsing it every time.

It's fairly far down in the noise on this particular profile, but in
the low-hanging fruit department, I think we should fix
ScanKeywordLookup to use a smarter algorithm that is more like O(1)
rather than O(lg n) in the number of keywords. It shouldn't be
terribly difficult to come up with some kind of hash function based
on, say, the first two characters of the keyword that would be a lot
faster than what we're doing now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-04-14 18:23:15 Re: POSIX shared memory redux
Previous Message David Fetter 2011-04-14 15:41:45 Re: Single client performance on trivial SELECTs