Re: Storing pg_stat_statements query texts externally, pg_stat_statements in core

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Storing pg_stat_statements query texts externally, pg_stat_statements in core
Date: 2014-01-22 18:29:53
Message-ID: 15115.1390415393@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter Geoghegan <pg(at)heroku(dot)com> writes:
> On Tue, Jan 21, 2014 at 8:01 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>> Actually, I think the whole thing is rather badly designed, as warm
>>> cache or no you're still proposing to do thousands of kernel calls
>>> while holding a potentially contended LWLock. I suggest what we
>>> do is (1) read in the whole file, (2) acquire the lock, (3)
>>> re-read the whole file in the same buffer, (4) work from the buffer.

>> fseeko() and fread() are part of the standard library, not any kernel.
>> I don't believe that what I've done in qtext_fetch() implies thousands
>> of kernel calls, or thousands of context switches.

> I ran an strace on a pg_stat_statements backend with a full ~5,000
> entries. It seems that glibc is actually content to always make
> lseek() and read() syscalls in the event of an fseek(), even though it
> does maintain its own buffer and could in principle have a lot more
> gumption about that [1]. I agree that we should copy everything into a
> buffer in one go.

Yeah, that was what I thought might happen. Even if glibc were smarter,
a lot of other libc implementations aren't. Also, ISTM that traversal
of the hash table will generally lead to more-or-less-random access into
the text file. So unless you assume that libc has mmap'd the whole
file, it'd have to do a lot of kernel calls anyway to get different pages
of the file into its buffer at different times.

I think that "read the whole file" is probably going to net out not any
more complex as far as our code goes, and it'll be making a lot fewer
assumptions about how smart the libc level is and what's happening at
the kernel level. I'll have a go at coding it, anyway.

> I think that making the LWLocking duration as brief as possible isn't
> critically important.

Maybe, but I don't like the idea that calling pg_stat_statements()
frequently could result in random glitches in response time for other
queries.

> Automated tools will only be interested in new
> query texts (implying possible I/O while share locking) as they
> observe new entries.

This argument would be more plausible if there were a way to fetch
the text for a previously-observed entry without invoking
pg_stat_statements(). I'm surprised you've not submitted a patch
to add such a function.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2014-01-22 19:11:09 Re: Incorrectly reporting config errors
Previous Message Heikki Linnakangas 2014-01-22 18:07:20 Re: pgsql: Compress GIN posting lists, for smaller index size.