Re: [rfc] overhauling pgstat.stat

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Satoshi Nagayasu <snaga(at)uptime(dot)jp>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [rfc] overhauling pgstat.stat
Date: 2013-09-05 03:54:57
Message-ID: 969AF419-93E5-45A3-846A-357F94F77DAB@gmail.com
Lists: pgsql-hackers

On 05-Sep-2013, at 8:58, Satoshi Nagayasu <snaga(at)uptime(dot)jp> wrote:

> (2013/09/05 3:59), Alvaro Herrera wrote:
>> Tomas Vondra wrote:
>>
>>> My idea was to keep the per-database stats, but allow some sort of
>>> "random" access - updating / deleting the records in place, adding
>>> records etc. The simplest way I could think of was adding a simple
>>> "index" - a mapping of OID to position in the stat file.
>>>
>>> I.e. a simple array of (oid, offset) pairs, stored in oid.stat.index or
>>> something like that. This would make it quite simple to access an
>>> existing record:
>>>
>>> 1: get position from the index
>>> 2: read sizeof(Entry) from the file
>>> 3: if it's an update, just overwrite the bytes; for a delete, set the
>>> isdeleted flag (which needs to be added to all entries)
>>>
>>> or reading all the records (just read the whole file as today).
>>
>> Sounds reasonable. However, I think the index should be a real index,
>> i.e. have a tree structure that can be walked down, not just a plain
>> array. If you have a 400 MB stat file, then you must have about 4
>> million tables, and you will not want to scan such a large array every
>> time you want to find an entry.
>
> I thought of an array structure at first.
>
> But, for now, I think we should have a real index for the
> statistics data because we already have several index storages,
> and it will allow us to minimize read/write operations.
>
> BTW, what kind of index would be preferred for this purpose?
> btree or hash?
>
> If we use btree, do we need a "range scan" capability on the
> statistics tables? I have no idea so far.
>
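
For reference, the (oid, offset) scheme quoted above could be sketched roughly as follows. This is only an illustration under assumptions: the record layout (ENTRY), the StatFile class and its method names are hypothetical, the index is kept as an in-memory dict rather than a separate oid.stat.index file, and none of this is the actual pgstat format.

```python
import io
import struct

# Hypothetical fixed-size entry: table OID, isdeleted flag, 3 pad bytes,
# and two 64-bit counters. This is NOT the real pgstat record layout.
ENTRY = struct.Struct("<I?xxxqq")


class StatFile:
    """Fixed-size stat records plus an (oid -> offset) index (a sketch)."""

    def __init__(self, f):
        self.f = f        # any seekable binary file object
        self.index = {}   # oid -> byte offset of the entry in the file

    def add(self, oid, ins=0, upd=0):
        # Append a new record and remember where it lives.
        offset = self.f.seek(0, io.SEEK_END)
        self.f.write(ENTRY.pack(oid, False, ins, upd))
        self.index[oid] = offset

    def read(self, oid):
        self.f.seek(self.index[oid])                    # 1: position from the index
        return ENTRY.unpack(self.f.read(ENTRY.size))    # 2: read sizeof(Entry)

    def update(self, oid, ins, upd):
        self.f.seek(self.index[oid])
        self.f.write(ENTRY.pack(oid, False, ins, upd))  # 3: overwrite in place

    def delete(self, oid):
        _, _, ins, upd = self.read(oid)
        self.f.seek(self.index.pop(oid))
        self.f.write(ENTRY.pack(oid, True, ins, upd))   # 3: set the isdeleted flag

    def read_all(self):
        # Read the whole file sequentially, skipping deleted entries.
        self.f.seek(0)
        live = []
        while chunk := self.f.read(ENTRY.size):
            oid, deleted, ins, upd = ENTRY.unpack(chunk)
            if not deleted:
                live.append((oid, ins, upd))
        return live
```

With an io.BytesIO() standing in for the stat file, add(), update() and delete() each touch only one record, while read_all() still walks the file front to back. Deleted entries keep occupying space until the file is rewritten.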

What I am interested in is range scans. That is why I would like to explore using a range tree here, perhaps as a secondary index.
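
To illustrate why an ordered index matters for range scans, here is a minimal sketch. It uses a sorted array with binary search as a stand-in for an ordered structure such as a btree; it is not a range tree and not a proposal for the on-disk format. The class and method names are hypothetical.

```python
from bisect import bisect_left, bisect_right


class SortedOidIndex:
    """Ordered (oid, offset) index: point lookups and range scans (a sketch)."""

    def __init__(self, pairs):
        pairs = sorted(pairs)
        self.oids = [oid for oid, _ in pairs]
        self.offsets = [off for _, off in pairs]

    def lookup(self, oid):
        # Binary search: O(log n) instead of scanning the whole array.
        i = bisect_left(self.oids, oid)
        if i < len(self.oids) and self.oids[i] == oid:
            return self.offsets[i]
        return None

    def range_scan(self, lo, hi):
        # All (oid, offset) pairs with lo <= oid <= hi, in oid order.
        i = bisect_left(self.oids, lo)
        j = bisect_right(self.oids, hi)
        return list(zip(self.oids[i:j], self.offsets[i:j]))
```

A hash index would serve lookup() equally well but could not answer range_scan() without visiting every entry, which is the trade-off behind the btree-or-hash question.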

Regards,

Atri
