Re: [rfc] overhauling pgstat.stat

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [rfc] overhauling pgstat.stat
Date: 2013-09-04 06:23:20
Message-ID: 748895BD-08D6-4668-8320-C112CAA21963@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sent from my iPad

On 04-Sep-2013, at 10:54, Satoshi Nagayasu <snaga(at)uptime(dot)jp> wrote:

> Hi,
>
> (2013/09/04 12:52), Atri Sharma wrote:
>> On Wed, Sep 4, 2013 at 6:40 AM, Satoshi Nagayasu <snaga(at)uptime(dot)jp> wrote:
>>> Hi,
>>>
>>> I'm considering overhauling pgstat.stat, and would like to know how many
>>> people are interested in this topic.
>>>
>>> As you may know, this file could be handreds of MB in size, because
>>> pgstat.stat holds all access statistics in each database, and it needs
>>> to read/write an entire pgstat.stat frequently.
>>>
>>> As a result, pgstat.stat often generates massive I/O operation,
>>> particularly when having a large number of tables in the database.
>>>
>>> To support multi-tenancy or just a large number of tables (up to 10k
>>> tables in single database), I think pgstat.stat needs to be overhauled.
>>>
>>> I think using heap and btree in pgstat.stat would be preferred to reduce
>>> read/write and to allow updating access statistics for specific tables
>>> in pgstat.stat file.
>>>
>>> Is this good for us?
>>
>> Hi,
>>
>> Nice thought. I/O reduction in pgstat can be really helpful.
>>
>> I am trying to think of our aim here. Would we be looking to split
>> pgstat per table, so that the I/O write happens for only a portion of
>> pgstat? Or reduce the I/O in general?
>
> I prefer the latter.
>
> Under the current implementation, DBA need to split single database
> into many smaller databases with considering access locality of the
> tables. It's difficult and could be change in future.
>
> And splitting the statistics data into many files (per table,
> for example) would cause another performance issue when
> collecting/showing statistics at once. Just my guess though.
>
> So, I'm looking for a new way to reduce I/O for the statistics data
> in general.
>
> Regards,
>
>>
>> If the later, how would using BTree help us? I would rather go for a
>> range tree or something. But again, I may be completely wrong.
>>
>> Please elaborate a bit more on the solution we are trying to
>> achieve.It seems really interesting.
>>
>> Regards,
>>
>> Atri
>>
>>
>
>

Right,thanks.

How would using heap and BTree help here? Are we looking at a priority queue which supports the main storage system of the stats?

Regards,

Atri

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Boszormenyi Zoltan 2013-09-04 08:06:24 Re: ECPG FETCH readahead
Previous Message Satoshi Nagayasu 2013-09-04 05:24:58 Re: [rfc] overhauling pgstat.stat