Re: Priority table or Cache table

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Priority table or Cache table
Date: 2014-02-21 01:02:54
Message-ID: CAJrrPGf9c6MJ6BdPDADdOzyPYQ4mh3jHuDPivj-TWS62WcP-zQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 20, 2014 at 10:06 PM, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> On Thu, Feb 20, 2014 at 10:23 AM, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>
>> On Thu, Feb 20, 2014 at 2:26 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>>> On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>>> > On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> >> > I want to propose a new feature called "priority table" or "cache
>>> >> > table". This is the same as a regular table except that the pages of
>>> >> > these tables have higher priority than those of normal tables. These
>>> >> > tables are very useful where faster query processing on some
>>> >> > particular tables is expected.
>>> >>
>>> >> Why exactly does the existing LRU behavior of shared buffers not do
>>> >> what you need?
>>> >
>>> >
>>> > Let's assume a database with 3 tables, all of which are accessed
>>> > regularly. The user expects faster query results on one of these
>>> > tables, but because of the LRU behavior that is sometimes not
>>> > happening.
>>>
>>> I think this will not be a problem for regularly accessed tables (pages);
>>> as per the current algorithm they will get more priority before getting
>>> flushed out of the shared buffer cache.
>>> Have you come across any case where regularly accessed pages
>>> get lower priority than non-regularly accessed pages?
>>>
>>
>> Because of other regularly accessed tables, the table which is expected
>> to give faster results sometimes gets delayed.
>>
>
> The solution involving buffer pools partitions the buffer cache into
> separate pools explicitly. The way the PostgreSQL buffer manager works,
> for a regular pattern of table accesses the buffer cache automatically
> reaches a stable point where the number of buffers containing pages
> belonging to a particular table stabilizes. Thus, at the equilibrium point
> for a given access pattern, the buffer cache automatically gets
> partitioned by table, each table using its share of buffers. So a
> solution using buffer pools seems unnecessary.
>

I checked some performance reports on Oracle's multiple buffer pool
concept, which show an increase in cache hit ratio compared to a single
buffer pool. That is why I proposed this split-pool solution. I don't know
how well it would really work for PostgreSQL. The Oracle performance
report is attached to this mail.

> PFA some scripts which I used to verify the behaviour. The scripts create
> two tables, one large and the other half its size
> (buffer_usage_objects.sql). The other script contains a few queries which
> simulate a simple table access pattern by running select count(*) on
> these tables N times. The same script contains a query on the
> pg_buffercache view provided by the pg_buffercache extension; this query
> counts the number of buffers used by each of these tables. So, if you run
> three sessions in parallel, two querying the tables and the third taking
> snapshots of buffer usage per table, you will be able to see this
> partitioning.
>

Thanks for the scripts. I will check it.
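
For reference, the per-table buffer count that these scripts take snapshots of can be obtained with a query along these lines (a sketch assuming the pg_buffercache extension is installed; relation filtering and LIMIT are illustrative):

```sql
-- Count shared buffers currently holding pages of each relation
-- in the current database (requires: CREATE EXTENSION pg_buffercache).
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```

Running this repeatedly while the two count(*) sessions are active should show the per-table buffer counts converging to a stable split, as described above.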

>>> However it might be required for cases where the user wants to control
>>> such behaviour and pass such hints through a table-level option or some
>>> other way, to indicate that he wants more priority for certain tables
>>> irrespective of their usage w.r.t. other tables.
>>>
>>> Now I think the important thing to find out is how helpful this is for
>>> users, or why they would want to control such behaviour even when the
>>> database already takes care of it based on the access pattern.
>>>
>>
>> Yes, it is useful in cases where the application always expects faster
>> results, whether the table is used regularly or not.
>>
>
> In such a case, it might be valuable to see if we should play with the
> maximum usage count, which is currently set to 5:
>
> #define BM_MAX_USAGE_COUNT 5
>

This is the first solution which I described in my first mail. Thanks,
I will look further into it.

Regards,
Hari Babu
Fujitsu Australia

Attachment Content-Type Size
oracle9i_buffer_pools.pdf application/pdf 51.2 KB
