per table random-page-cost?

From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: per table random-page-cost?
Date: 2009-10-19 21:08:40
Message-ID: b1b9fac60910191408x4e6a639v15d14cfff415b5ec@mail.gmail.com
Lists: pgsql-hackers

Currently random_page_cost is a GUC. I propose that this could be set per-table.

I think this is a good idea for widely-wanted planner hints. This way
you can say "I do NOT want this table to be index-scanned, because I
know it is not cached" by setting its random_page_cost to a large
value (and obviously you can go the other way around: by setting the
random_page_cost to 1 you say "I don't care how you fetch the pages,
they are all in cache").

The value for the per-table setting could be inferred from
pg_stat(io)?.*tables. We could have a tool to suggest appropriate
values.

We could call it something like cached_percentage (and have the cost
of a random tuple fetch be inferred from the global random_page_cost,
seq_tuple_cost and the per-table cached_percentage). Then we could set
the global random_page_cost to a sane value like 200. Now one can
wonder why the planner works while having such blatantly unrealistic
values for random_page_cost :)

What do you think?

Greetings
Marcin Mank


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 21:14:40
Message-ID: 603c8f070910191414g732ef2d7g10d012167c4127c6@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2009 at 5:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> Currently random_page_cost is a GUC. I propose that this could be set per-table.
>
> I think this is a good idea for widely-wanted planner hints. This way
> you can say "I do NOT want this table to be index-scanned, because I
> know it is not cached" by setting its random_page_cost to a large
> value (and obviously you can go the other way around: by setting the
> random_page_cost to 1 you say "I don't care how you fetch the pages,
> they are all in cache")
>
> The value for the per-table setting could be inferred from
> pg_stat(io)?.*tables . We could have a tool to suggest appropriate
> values.
>
> We could call it something like cached_percentage (and have the cost
> of a random tuple fetch be inferred from the global random_page_cost,
> seq_tuple_cost and the per-table cached_percentage). Then we could set
> the global random_page_cost to a sane value like 200. Now one can
> wonder why the planner works while having such blatantly unrealistic
> values for random_page_cost :)
>
> What do you think?

I've been thinking about this a bit, too. I've been wondering if it
might make sense to have a "random_page_cost" and "seq_page_cost"
setting for each TABLESPACE, to compensate for the fact that different
media might be faster or slower, and a percent-cached setting for each
table over top of that.

...Robert


From: Greg Stark <gsstark(at)mit(dot)edu>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 21:27:20
Message-ID: 407d949e0910191427md5c9a9au1efc0dd928658106@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> Currently random_page_cost is a GUC. I propose that this could be set per-table.

Or per-tablespace.

Yes, I think there is a class of GUCs which describe the physical
attributes of the storage system and which should be per-table or
per-tablespace. random_page_cost, sequential_page_cost, and
effective_io_concurrency come to mind.

While this isn't a simple flag to change, it does seem like a bit of a
SMOP (a simple matter of programming). The GUC infrastructure stores
these values in global variables
which the planner and other systems consult directly. They would
instead have to be made storage parameters which the planner and other
systems check on the appropriate table and default to the global GUC
if they're not set.
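The fallback scheme described here amounts to something like the following sketch (in Python, with invented names, purely for illustration — the real change would be to the C reloptions/GUC machinery):

```python
# Global GUC defaults, as the planner sees them today.
GUCS = {"random_page_cost": 4.0, "seq_page_cost": 1.0}

def planner_param(name, reloptions):
    """Return the per-table storage parameter if set, else the global GUC."""
    return reloptions.get(name, GUCS[name])

# A table with no override uses the global default:
planner_param("random_page_cost", {})                           # 4.0
# A table known to be uncached can be given a high per-table override:
planner_param("random_page_cost", {"random_page_cost": 100.0})  # 100.0
```

The point is only the lookup order: storage parameter first, GUC as the default when the parameter is unset.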

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "marcin mank" <marcin(dot)mank(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 21:54:47
Message-ID: 4ADC99D7020000250002BB4E@gw.wicourts.gov
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I've been wondering if it might make sense to have a
> "random_page_cost" and "seq_page_cost" setting for each TABLESPACE,
> to compensate for the fact that different media might be faster or
> slower, and a percent-cached setting for each table over top of
> that.

[after recovering from the initial cringing reaction...]

How about calculating an effective percentage based on other
information? effective_cache_size, along with relation and database
size, come to mind. How about the particular index being considered
for the plan? Of course, you might have to be careful about working
in TOAST table size for a particular query, based on the columns
retrieved.

I have no doubt that there would be some major performance regressions
in the first cut of anything like this, for at least *some* queries.
The toughest part of this might be to get adequate testing to tune it
for a wide enough variety of real-life situations.

-Kevin


From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 22:17:55
Message-ID: b1b9fac60910191517w137a0832h544ecd211697fb4e@mail.gmail.com
Lists: pgsql-hackers

> I've been thinking about this a bit, too.  I've been wondering if it
> might make sense to have a "random_page_cost" and "seq_page_cost"
> setting for each TABLESPACE, to compensate for the fact that different
> media might be faster or slower, and a percent-cached setting for each
> table over top of that.
>

I thought about making it per-table, but realistically I think most
people don't use tablespaces now. I would not want to be telling
people "to be able to hint the planner to (not) index-scan the table,
you must move it to a separate tablespace".

A global default, a per-tablespace default overriding it, and a
per-table value overriding them both seems like over-engineering to
me.

Greetings
Marcin


From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 22:19:14
Message-ID: b1b9fac60910191519h21737358k38fc569afdf1efc8@mail.gmail.com
Lists: pgsql-hackers

> I thought about making it per-table***space***, but realistically I


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 22:33:35
Message-ID: 12816.1255991615@sss.pgh.pa.us
Lists: pgsql-hackers

marcin mank <marcin(dot)mank(at)gmail(dot)com> writes:
> I thought about making it per-table, but realistically I think most
> people don't use tablespaces now. I would not want to be telling
> people "to be able to hint the planner to (not) index-scan the table,
> you must move it to a separate tablespace".

Per-table is not physically sensible. Per-tablespace has some rationale
to it.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 22:45:33
Message-ID: 4ADCEC0D.40808@dunslane.net
Lists: pgsql-hackers

marcin mank wrote:
>> I've been thinking about this a bit, too. I've been wondering if it
>> might make sense to have a "random_page_cost" and "seq_page_cost"
>> setting for each TABLESPACE, to compensate for the fact that different
>> media might be faster or slower, and a percent-cached setting for each
>> table over top of that.
>>
>>
>
> I thought about making it per-table, but realistically I think most
> people don't use tablespaces now. I would not want to be telling
> people "to be able to hint the planner to (not) index-scan the table,
> you must move it to a separate tablespace".
>

This is just plain wrong, in my experience. *Every* large installation I
deal with uses tablespaces.

This proposal is just "hints by the back door", ISTM. As Tom says, there
is a justification for having it on tablespaces but not on individual
tables.

If you want to argue for full-blown planner hints, that's a whole other
story. Have you read the previous debates on the subject?

cheers


From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 23:04:04
Message-ID: b1b9fac60910191604r59cc36d3tae146cce6ec1c439@mail.gmail.com
Lists: pgsql-hackers

> This proposal is just "hints by the back door", ISTM. As Tom says, there is
> a justification for having it on tablespaces but not on individual tables.

If the parameter is defined as "the chance that a page is in cache"
there is very real physical meaning to it. And this is per-table, not
per-tablespace. A "users" table will likely be fetched from cache all
the time, while a "billing_records" table will be fetched mostly from
disk.

Greetings
Marcin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 23:21:23
Message-ID: 13775.1255994483@sss.pgh.pa.us
Lists: pgsql-hackers

marcin mank <marcin(dot)mank(at)gmail(dot)com> writes:
>> This proposal is just "hints by the back door", ISTM. As Tom says, there is
>> a justification for having it on tablespaces but not on individual tables.

> If the parameter is defined as "the chance that a page is in cache"
> there is very real physical meaning to it.

We have no such parameter...

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 23:29:05
Message-ID: 407d949e0910191629w7607e384g24535e3d760a811c@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2009 at 4:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> marcin mank <marcin(dot)mank(at)gmail(dot)com> writes:
>>> This proposal is just "hints by the back door", ISTM. As Tom says, there is
>>> a justification for having it on tablespaces but not on individual tables.
>
>> If the parameter is defined as "the chance that a page is in cache"
>> there is very real physical meaning to it.
>
> We have no such parameter...

And we want our parameters to be things the DBA has a chance of being
able to estimate. How would you come up with sensible figures for this
hypothetical parameter?

--
greg


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-19 23:39:16
Message-ID: 407d949e0910191639k6bc9d71bu2c5638a260ce13a3@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2009 at 2:54 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> How about calculating an effective percentage based on other
> information.  effective_cache_size, along with relation and database
> size, come to mind.

I think previous proposals for this have fallen down when you actually
try to work out a formula for this. The problem is that you could have
a table which is much smaller than effective_cache_size but is never
in cache due to it being one of many such tables.

I think it would still be good to have some naive kind of heuristic
here as long as it's fairly predictable for DBAs.

But the long-term strategy here I think is to actually have some way
to measure the real cache hit rate on a per-table basis. Whether it's
by timing i/o operations, programmatic access to dtrace, or some other
kind of os interface, if we could know the real cache hit rate it
would be very helpful.

Perhaps we could extrapolate from the shared buffer cache percentage.
If there's a moderately high percentage in shared buffers, it seems
reasonable to assume the filesystem cache would have a similar
distribution.
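A naive version of that extrapolation might look like this (a Python sketch; the function name and the proportionality assumption are mine, not an agreed design):

```python
def estimated_cached_fraction(table_buffers, table_pages,
                              shared_buffers, effective_cache_pages):
    """Guess what fraction of a table is cached anywhere (shared buffers
    plus OS cache) by assuming the table holds the same share of the OS
    cache as it holds of shared buffers."""
    share_of_shared = table_buffers / shared_buffers
    cached_pages = min(share_of_shared * effective_cache_pages, table_pages)
    return cached_pages / table_pages

# A table holding 10% of shared buffers, with a total effective cache four
# times the size of shared buffers, is guessed to be ~40% cached
# (capped at 100% of the table):
estimated_cached_fraction(100, 1000, 1000, 4000)
```

The obvious failure mode is the one described above: a table that cycles through shared buffers quickly will look partially cached even when it never stays resident.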

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-20 00:29:33
Message-ID: 603c8f070910191729x2547ddb5x89218e471c1bd0@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2009 at 5:54 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> I've been wondering if it might make sense to have a
>> "random_page_cost" and "seq_page_cost" setting for each TABLESPACE,
>> to compensate for the fact that different media might be faster or
>> slower, and a percent-cached setting for each table over top of
>> that.
>
> [after recovering from the initial cringing reaction...]
>
> How about calculating an effective percentage based on other
> information?  effective_cache_size, along with relation and database
> size, come to mind.  How about the particular index being considered
> for the plan?  Of course, you might have to be careful about working
> in TOAST table size for a particular query, based on the columns
> retrieved.

I think that a per-tablespace page cost should be set by the DBA, same
as we do with global page-costs now.

OTOH, I think that a per-relation percent-in-cache should be
automatically calculated by the database (somehow), and the DBA should
have an option to override it in case the database does the wrong
thing. I gave a lightning talk on this topic at PGCon.

> I have no doubt that there would be some major performance regressions
> in the first cut of anything like this, for at least *some* queries.
> The toughest part of this might be to get adequate testing to tune it
> for a wide enough variety of real-life situations.

Agreed.

...Robert


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-20 00:34:20
Message-ID: 1255998860.31947.189.camel@monkey-cat.sm.truviso.com
Lists: pgsql-hackers

On Mon, 2009-10-19 at 16:39 -0700, Greg Stark wrote:
> But the long-term strategy here I think is to actually have some way
> to measure the real cache hit rate on a per-table basis. Whether it's
> by timing i/o operations, programmatic access to dtrace, or some other
> kind of os interface, if we could know the real cache hit rate it
> would be very helpful.

Maybe it would be simpler to just get the high-order bit: is this table
likely to be completely in cache (shared buffers or os buffer cache), or
not?

The lower cache hit ratios are uninteresting: the performance difference
between 1% and 50% is only about a factor of two. Hit ratios that are
high but still short of "almost 100%" seem unlikely: what kind of
scenario would involve a stable 90% cache hit ratio for a table?
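The factor-of-two claim is easy to check with the kind of blended fetch cost the thread is discussing (a sketch; the cost numbers are arbitrary, chosen only so that a cached fetch is much cheaper than a disk fetch):

```python
def avg_fetch_cost(hit_ratio, cached_cost=0.01, disk_cost=1.0):
    """Expected cost of one page fetch given a cache hit ratio."""
    return hit_ratio * cached_cost + (1 - hit_ratio) * disk_cost

ratio = avg_fetch_cost(0.01) / avg_fetch_cost(0.50)
# ratio comes out around 1.96: going from a 1% to a 50% hit rate barely
# halves the average fetch cost, while going from 50% to 99.9% improves
# it by a further factor of roughly 45.
```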

Regards,
Jeff Davis


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-20 02:21:24
Message-ID: f67928030910191921r5a9a3d7en58847920d1a89bd8@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2009 at 4:29 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Mon, Oct 19, 2009 at 4:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> marcin mank <marcin(dot)mank(at)gmail(dot)com> writes:
>>>> This proposal is just "hints by the back door", ISTM. As Tom says, there is
>>>> a justification for having it on tablespaces but not on individual tables.
>>
>>> If the parameter is defined as "the chance that a page is in cache"
>>> there is very real physical meaning to it.
>>
>> We have no such parameter...
>
>
> And we want our parameters to be things the DBA has a chance of being
> able to estimate.

Do the current parameters meet that standard? When setting
seq_page_cost now, don't people have a lot of "Well, we're about this
likely to find it in the cache anyway" built into their settings?

Jeff


From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-20 08:31:42
Message-ID: b1b9fac60910200131j2e9901b5n8f11adab40f199b0@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 20, 2009 at 1:21 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If the parameter is defined as "the chance that a page is in cache"
>> there is very real physical meaning to it.
>
> We have no such parameter...

What a simple person like me would think would work is:

- call the parameter "cached_probability".
- Invent a way to store it (I'd actually try to do it the exact same
way the recent "alter table set statistics distinct" patch does it)

a) less radical idea: replace all usage of random_page_cost with
seq_page_cost * cached_probability + random_page_cost *
(1-cached_probability)

b) more radical idea:
b1) invent a new GUC: cached_page_cost
b2) replace all usage of seq_page_cost with cached_page_cost *
cached_probability + seq_page_cost * (1-cached_probability)
b3) replace all usage of random_page_cost with cached_page_cost *
cached_probability + random_page_cost * (1-cached_probability)
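Spelled out as code, the two variants above would be something like this sketch (parameter names follow the message; everything else is illustrative, not an implementation):

```python
def blended_random_cost_a(random_page_cost, seq_page_cost, cached_probability):
    """Variant (a): only the random fetch cost blends toward seq_page_cost
    as more of the table is assumed cached."""
    return (cached_probability * seq_page_cost
            + (1 - cached_probability) * random_page_cost)

def blended_costs_b(random_page_cost, seq_page_cost, cached_page_cost,
                    cached_probability):
    """Variant (b): both sequential and random fetch costs blend toward a
    new cached_page_cost GUC."""
    seq = (cached_probability * cached_page_cost
           + (1 - cached_probability) * seq_page_cost)
    rand = (cached_probability * cached_page_cost
            + (1 - cached_probability) * random_page_cost)
    return seq, rand
```

With cached_probability = 1, both variants make a random fetch cost the same as a cached one, which is exactly the "it is all in cache" hint from the first message; with cached_probability = 0, the global random_page_cost (even a large value like 200) applies unchanged.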

> How would you come up with sensible figures for this hypothetical parameter?

select schemaname, relname,
       heap_blks_hit / cast(heap_blks_read + heap_blks_hit + 1 as float)
from pg_statio_all_tables

Would be a nice starting point.

Greetings
Marcin


From: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 15:03:33
Message-ID: 200910221703.33989.cedric.villemain@dalibo.com
Lists: pgsql-hackers

On Monday 19 October 2009 at 23:14:40, Robert Haas wrote:
> On Mon, Oct 19, 2009 at 5:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> > Currently random_page_cost is a GUC. I propose that this could be set
> > per-table.
> >
> > I think this is a good idea for widely-wanted planner hints. This way
> > you can say "I do NOT want this table to be index-scanned, because I
> > know it is not cached" by setting its random_page_cost to a large
> > value (and obviously you can go the other way around: by setting the
> > random_page_cost to 1 you say "I don't care how you fetch the pages,
> > they are all in cache")
> >
> > The value for the per-table setting could be inferred from
> > pg_stat(io)?.*tables . We could have a tool to suggest appropriate
> > values.
> >
> > We could call it something like cached_percentage (and have the cost
> > of a random tuple fetch be inferred from the global random_page_cost,
> > seq_tuple_cost and the per-table cached_percentage). Then we could set
> > the global random_page_cost to a sane value like 200. Now one can
> > wonder why the planner works while having such blatantly unrealistic
> > values for random_page_cost :)
> >
> > What do you think?
>
> I've been thinking about this a bit, too. I've been wondering if it
> might make sense to have a "random_page_cost" and "seq_page_cost"
> setting for each TABLESPACE, to compensate for the fact that different
> media might be faster or slower, and a percent-cached setting for each
> table over top of that.

At least per-TABLESPACE settings should exist. I totally agree with that.

>
> ...Robert
>

--
Cédric Villemain
Database Administrator
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


From: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 15:16:31
Message-ID: 200910221716.36366.cedric.villemain@dalibo.com
Lists: pgsql-hackers

On Monday 19 October 2009 at 23:27:20, Greg Stark wrote:
> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> > Currently random_page_cost is a GUC. I propose that this could be set
> > per-table.
>
> Or per-tablespace.
>
> Yes, I think there are a class of GUCs which describe the physical
> attributes of the storage system which should be per-table or
> per-tablespace. random_page_cost, sequential_page_cost,
> effective_io_concurrency come to mind.

and, perhaps effective_cache_size.

You can have situations where you don't want some tables to go into OS
memory (you can disable that at the filesystem level ... I'd like to be
able to do that at the postgres level, but that is another point).

So you put those tables in a separate tablespace, and tell postgresql
that the effective_cache_size is 0 (for this tablespace), leaving it up
to postgres to do the right thing with that ;)

>
> While this isn't a simple flag to change it does seem like a bit of a
> SMOP. The GUC infrastructure stores these values in global variables
> which the planner and other systems consult directly. They would
> instead have to be made storage parameters which the planner and other
> systems check on the appropriate table and default to the global GUC
> if they're not set.
>

--
Cédric Villemain
Database Administrator
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 15:33:48
Message-ID: 603c8f070910220833t79814e05se5d5d6eb035bc750@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
<cedric(dot)villemain(at)dalibo(dot)com> wrote:
> On Monday 19 October 2009 at 23:27:20, Greg Stark wrote:
>> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
>> > Currently random_page_cost is a GUC. I propose that this could be set
>> > per-table.
>>
>> Or per-tablespace.
>>
>> Yes, I think there are a class of GUCs which describe the physical
>> attributes of the storage system which should be per-table or
>> per-tablespace. random_page_cost, sequential_page_cost,
>> effective_io_concurrency come to mind.
>
> and, perhaps effective_cache_size.
>
> You can have situations where you don't want some tables to go into OS
> memory (you can disable that at the filesystem level ... I'd like to be
> able to do that at the postgres level, but that is another point)
>
> So you put those tables in a separate tablespace, and tell postgresql
> that the effective_cache_size is 0 (for this tablespace), leaving it up
> to postgres to do the right thing with that ;)

Why would you ever want to set effective_cache_size to 0?

...Robert


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 18:01:23
Message-ID: 407d949e0910221101te247c85me4e1027d8090d405@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 22, 2009 at 8:16 AM, Cédric Villemain
<cedric(dot)villemain(at)dalibo(dot)com> wrote:
> You can have situation where you don't want some tables go to OS memory

I don't think this is a configuration we want to cater for. The
sysadmin shouldn't be required to understand the i/o pattern of
postgres. He or she cannot know whether the database will want to
access the same blocks twice for internal algorithms, in ways that
aren't visible from the user's point of view.

The scenarios where you might want to do this would be if you know
there are tables which are accessed very randomly with no locality and
very low cache hit rates. I think the direction we want to head is
towards making sure the cache manager is automatically resistant to
such data.

There is another use case which perhaps needs to be addressed: if the
user has some queries which are very latency sensitive and others
which are not latency sensitive. In that case it might be very
important to keep the pages of data used by the high priority queries
in the cache. That's something we should have a high level abstract
interface for, not depend on low level system features.

--
greg


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 18:28:44
Message-ID: 4AE0A45C.3060805@agliodbs.com
Lists: pgsql-hackers

All,

Wouldn't per *tablespace* costs make more sense?

--Josh


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <cedric(dot)villemain(at)dalibo(dot)com>,"Greg Stark" <gsstark(at)mit(dot)edu>
Cc: "marcin mank" <marcin(dot)mank(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 18:41:23
Message-ID: 4AE06103020000250002BDF6@gw.wicourts.gov
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> wrote:

> There is another use case which perhaps needs to be addressed: if
> the user has some queries which are very latency sensitive and
> others which are not latency sensitive.

Yes. Some products allow you to create a named cache and bind
particular objects to it. This can be used both to keep a large
object with a low cache hit rate from pushing other things out of the
cache and to create a pseudo "memory resident" set of objects by
binding them to a cache which is sized a little bigger than those
objects. I don't know if you have any other suggestions for this
problem, but the named cache idea didn't go over well last time it was
suggested.

In all fairness, PostgreSQL does a good enough job in general that I
haven't missed this feature nearly as much as I thought I would; and
its absence means one less thing to worry about keeping properly
tuned.

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 18:48:46
Message-ID: 603c8f070910221148u67493436o260a7502e948b554@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 22, 2009 at 2:28 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> All,
>
> Wouldn't per *tablespace* costs make more sense?
>
> --Josh

Yes, we already had several votes in favor of that approach. See upthread.

...Robert


From: Greg Stark <stark(at)mit(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "<cedric(dot)villemain(at)dalibo(dot)com>" <cedric(dot)villemain(at)dalibo(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 20:25:22
Message-ID: A190D5B5-413E-4C78-AF03-5A3E328FD41D@mit.edu
Lists: pgsql-hackers

Well, I think we need some way to accomplish the same high-level goal
of guaranteeing response times for latency-critical queries.

However my point is that cache policy is an internal implementation
detail we don't want to expose in a user interface.

--
Greg

On 2009-10-22, at 11:41 AM, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov
> wrote:

> Greg Stark <gsstark(at)mit(dot)edu> wrote:
>
>> There is another use case which perhaps needs to be addressed: if
>> the user has some queries which are very latency sensitive and
>> others which are not latency sensitive.
>
> Yes. Some products allow you to create a named cache and bind
> particular objects to it. This can be used both to keep a large
> object with a low cache hit rate from pushing other things out of the
> cache and to create a pseudo "memory resident" set of objects by
> binding them to a cache which is sized a little bigger than those
> objects. I don't know if you have any other suggestions for this
> problem, but the named cache idea didn't go over well last time it was
> suggested.
>
> In all fairness, PostgreSQL does a good enough job in general that I
> haven't missed this feature nearly as much as I thought I would; and
> its absence means one less thing to worry about keeping properly
> tuned.
>
> -Kevin
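[Editor's note: for readers who haven't met the feature Kevin describes, a named cache is a separately sized buffer pool that specific relations are bound to. A toy Python sketch of the idea follows; all names are illustrative, not any product's actual API.]

```python
from collections import OrderedDict

class NamedCache:
    """A fixed-size LRU cache. Sizing it a little larger than the
    objects bound to it makes them effectively memory-resident."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page id -> page data

    def access(self, page_id, load):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)     # refresh LRU position
        else:
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
            self.pages[page_id] = load(page_id)
        return self.pages[page_id]

class BufferManager:
    """Routes each relation to its bound named cache, so a large,
    low-hit-rate relation can only evict its *own* pages."""
    def __init__(self):
        self.caches = {}
        self.binding = {}  # relation name -> cache name

    def create_cache(self, name, capacity):
        self.caches[name] = NamedCache(capacity)

    def bind(self, relation, cache_name):
        self.binding[relation] = cache_name

    def read(self, relation, page_no, load=lambda pid: b""):
        cache = self.caches[self.binding[relation]]
        return cache.access((relation, page_no), load)
```

Binding the big low-hit-rate relation to its own small cache means a scan of it cannot push the hot objects out of the "resident" cache, which is exactly the two use cases Kevin names.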


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-22 23:08:15
Message-ID: 1256252895.2821.9.camel@jd-desktop.unknown.charter.com
Lists: pgsql-hackers

On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
> On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
> <cedric(dot)villemain(at)dalibo(dot)com> wrote:
> > On Monday, October 19, 2009 23:27:20, Greg Stark wrote:
> >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> >> > Currently random_page_cost is a GUC. I propose that this could be set
> >> > per-table.
> >>
> >> Or per-tablespace.
> >>
> >> Yes, I think there are a class of GUCs which describe the physical
> >> attributes of the storage system which should be per-table or
> >> per-tablespace. random_page_cost, sequential_page_cost,
> >> effective_io_concurrency come to mind.
> >
> > and, perhaps effective_cache_size.
> >
> > You can have situation where you don't want some tables go to OS memory (you
> > can disabled that at filesystem level, ... l'd like to be able to do that at
> > postgres level but it is another point)
> >
> > So you put those tables in a separate tablespace, and tell postgresql that the
> > effective_cache_size is 0 (for this tablespace), up to postgres to do the right
> > thing with that ;)
>
> Why would you ever want to set effective_cache_size to 0?

I think this is a misunderstanding of how effective_cache_size works. I
can't think of any reason to do that. I could see a reason to tell the
OS to not throw a relation into cache but that is a different thing.

Joshua D. Drake

>
> ...Robert
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-23 01:38:06
Message-ID: 20091023013805.GB2240@alvh.no-ip.org
Lists: pgsql-hackers

Greg Stark wrote:

> There is another use case which perhaps needs to be addressed: if the
> user has some queries which are very latency sensitive and others
> which are not latency sensitive. In that case it might be very
> important to keep the pages of data used by the high priority queries
> in the cache. That's something we should have a high level abstract
> interface for, not depend on low level system features.

Yeah.

I wonder if the right thing for this is to mark the objects (pages), or
the queries, as needing special attention. If you mark the queries,
then perhaps they could behave slightly differently like adding +2 or so
to buffer usage count instead of +1, so that they take longer than a
normal buffer in getting evicted. This way you don't force the admin to
figure out what's the right size ratio for different named caches, etc.
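[Editor's note: Alvaro's "+2 to usage count" idea can be sketched against a simplified clock-sweep, loosely modeled on PostgreSQL's buffer replacement. This toy ignores pinning and the real code's details (which live in freelist.c), so treat it as an illustration of the principle only.]

```python
def clock_sweep(usage, hand=0):
    """Advance the clock hand, decrementing usage counts, until a
    buffer with usage count 0 is found; return its index (the victim)."""
    n = len(usage)
    while True:
        i = hand % n
        if usage[i] == 0:
            return i
        usage[i] -= 1
        hand += 1

def touch(usage, i, boost=1, cap=5):
    """A normal access adds 1 to the usage count; a latency-critical
    query could add 2 instead, so its pages need one extra full sweep
    of the clock hand before they fall to zero and become evictable."""
    usage[i] = min(usage[i] + boost, cap)
```

The point of the boost is visible in a tiny run: a buffer touched with +2 outlives one touched with +1 across successive sweeps, without the admin ever sizing a separate cache.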

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, jd(at)commandprompt(dot)com
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-23 09:23:31
Message-ID: 200910231123.36253.cedric.villemain@dalibo.com
Lists: pgsql-hackers

On Friday, October 23, 2009 01:08:15, Joshua D. Drake wrote:
> On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
> > On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
> >
> > <cedric(dot)villemain(at)dalibo(dot)com> wrote:
> > > On Monday, October 19, 2009 23:27:20, Greg Stark wrote:
> > >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com>
wrote:
> > >> > Currently random_page_cost is a GUC. I propose that this could be
> > >> > set per-table.
> > >>
> > >> Or per-tablespace.
> > >>
> > >> Yes, I think there are a class of GUCs which describe the physical
> > >> attributes of the storage system which should be per-table or
> > >> per-tablespace. random_page_cost, sequential_page_cost,
> > >> effective_io_concurrency come to mind.
> > >
> > > and, perhaps effective_cache_size.
> > >
> > > You can have situation where you don't want some tables go to OS memory
> > > (you can disabled that at filesystem level, ... l'd like to be able to
> > > do that at postgres level but it is another point)
> > >
> > > So you put those tables in a separate tablespace, and tell postgresql
> > > that the effective_cache_size is 0 (for this tablespace), up to
> > > postgres to do the right thing with that ;)
> >
> > Why would you ever want to set effective_cache_size to 0?
>
> I think this is a misunderstanding of how effective_cache_size works. I
> can't think of any reason to do that. I could see a reason to tell the
> OS to not throw a relation into cache but that is a different thing.

Well, the effective_cache_size in this context is OS cache memory (0 in my case)
+ an estimate of shared_buffers... ah, so the DBA should estimate the amount in
shared_buffers only, OK.

So consider effective_cache_size = 0 + what pg_buffercache will tell you.

My case is a table containing 29 GB of bytea in a database of 52 GB. Every row
in the 29 GB table is grabbed only a few times, and it just churns the OS cache
every time (the server has only 8 GB of RAM). So when I remove this table (but
not its index) from the OS cache, I keep more interesting blocks cached.

And the disk + RAID are quick enough to get by without the OS cache for this
tablespace.

Are things a bit clearer, and the usage not so silly?

>
> Joshua D. Drake
>
> > ...Robert
>

--
Cédric Villemain
Database Administrator
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, jd(at)commandprompt(dot)com, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-23 12:23:09
Message-ID: 603c8f070910230523x2c7a772aifc56039847c3f177@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 23, 2009 at 5:23 AM, Cédric Villemain
<cedric(dot)villemain(at)dalibo(dot)com> wrote:
> On Friday, October 23, 2009 01:08:15, Joshua D. Drake wrote:
>> On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
>> > On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
>> >
>> > <cedric(dot)villemain(at)dalibo(dot)com> wrote:
>> > > On Monday, October 19, 2009 23:27:20, Greg Stark wrote:
>> > >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com>
> wrote:
>> > >> > Currently random_page_cost is a GUC. I propose that this could be
>> > >> > set per-table.
>> > >>
>> > >> Or per-tablespace.
>> > >>
>> > >> Yes, I think there are a class of GUCs which describe the physical
>> > >> attributes of the storage system which should be per-table or
>> > >> per-tablespace. random_page_cost, sequential_page_cost,
>> > >> effective_io_concurrency come to mind.
>> > >
>> > > and, perhaps effective_cache_size.
>> > >
>> > > You can have situation where you don't want some tables go to OS memory
>> > > (you can disabled that at filesystem level, ... l'd like to be able to
>> > > do that at postgres level but it is another point)
>> > >
>> > > So you put those tables in a separate tablespace, and tell postgresql
>> > > that the effective_cache_size is 0 (for this tablespace), up to
>> > > postgres to do the right thing with that ;)
>> >
>> > Why would you ever want to set effective_cache_size to 0?
>>
>> I think this is a misunderstanding of how effective_cache_size works. I
>> can't think of any reason to do that. I could see a reason to tell the
>> OS to not throw a relation into cache but that is a different thing.
>
> Well the effective_cache_size in this context is OS cache memory (0 in my case)
> + estimation of shared_buffer.. ah so DBA should estimate the amount in the
> shared_buffer only, ok.
>
> So consider effective_cache_size = 0 + what pg_buffer_cache will tell.
>
> My case is a table containing 29 GB of bytea in a database of 52 GB. Every row
> on the 29GB table is grab only few times. And it will just renew OS cache
> memory  every time (the server have only 8GB of ram).
> So when I remove this table (not the index) from the OS cache memory,  I keep
> more interesting blocks in the OS cache memory.
>
> And disk + raid are quick enought to bypass the OS cache memory for this
> tablespace.
>
>
> Are things a bit clearer and usage not so silly ?

Well, I think you're vastly overestimating the power of
effective_cache_size. effective_cache_size changes the planner's
estimation of how likely a repeated partial index scan is to find the
same block in cache. So it only affects
nested-loop-with-inner-indexscan plans, and if effective_cache_size is
set to a value larger than the size of the index (or maybe the
relation, I'm too lazy to go reread the code right now), one value is
as good as another. For a typical user, I think you could set
effective_cache_size to, say, a terabyte, and it wouldn't make a bit
of difference. Heck, why not 2TB.

As far as I can see, the only possible value of setting this knob to a
value other than positive-infinity is that if you have a huge dataset
that's not close to fitting in memory, this might cause the planner to
pick a merge join over a nested loop with inner indexscan, which might
be better if it makes the I/O sequential rather than random. Anyone
think I'm a pessimist?
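[Editor's note: what Robert describes is the Mackert-Lohman approximation used by index_pages_fetched() in costsize.c. A Python transcription of the formula from that file's comment follows; treat the exact coefficients as an assumption, but it shows why effective_cache_size (b, in pages) stops mattering once it exceeds the table size T.]

```python
def index_pages_fetched(tuples_fetched, T, b):
    """Mackert-Lohman estimate of distinct heap pages fetched when an
    index scan retrieves Ns tuples from a T-page table, given an
    effective cache of b pages."""
    Ns = float(tuples_fetched)
    T = float(T)
    b = float(b)
    if T <= b:
        # whole table fits in cache: pages fetched saturates at T,
        # so any larger b gives the identical answer
        return min(2.0 * T * Ns / (2.0 * T + Ns), T)
    lim = 2.0 * T * b / (2.0 * T - b)
    if Ns <= lim:
        return 2.0 * T * Ns / (2.0 * T + Ns)
    # cache overflows: extra fetches hit disk in proportion (T-b)/T
    return b + (Ns - lim) * (T - b) / T
```

Once b >= T, raising effective_cache_size further changes nothing, which is the "one value is as good as another" point above; only when the data set dwarfs the cache does the third branch kick in and penalize the repeated index scan.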

...Robert


From: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, jd(at)commandprompt(dot)com, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-23 13:23:01
Message-ID: 200910231523.01302.cedric.villemain@dalibo.com
Lists: pgsql-hackers

On Friday, October 23, 2009 14:23:09, Robert Haas wrote:
> On Fri, Oct 23, 2009 at 5:23 AM, Cédric Villemain
>
> <cedric(dot)villemain(at)dalibo(dot)com> wrote:
> > On Friday, October 23, 2009 01:08:15, Joshua D. Drake wrote:
> >> On Thu, 2009-10-22 at 11:33 -0400, Robert Haas wrote:
> >> > On Thu, Oct 22, 2009 at 11:16 AM, Cédric Villemain
> >> >
> >> > <cedric(dot)villemain(at)dalibo(dot)com> wrote:
> >> > > On Monday, October 19, 2009 23:27:20, Greg Stark wrote:
> >> > >> On Mon, Oct 19, 2009 at 2:08 PM, marcin mank
> >> > >> <marcin(dot)mank(at)gmail(dot)com>
> >
> > wrote:
> >> > >> > Currently random_page_cost is a GUC. I propose that this could be
> >> > >> > set per-table.
> >> > >>
> >> > >> Or per-tablespace.
> >> > >>
> >> > >> Yes, I think there are a class of GUCs which describe the physical
> >> > >> attributes of the storage system which should be per-table or
> >> > >> per-tablespace. random_page_cost, sequential_page_cost,
> >> > >> effective_io_concurrency come to mind.
> >> > >
> >> > > and, perhaps effective_cache_size.
> >> > >
> >> > > You can have situation where you don't want some tables go to OS
> >> > > memory (you can disabled that at filesystem level, ... l'd like to
> >> > > be able to do that at postgres level but it is another point)
> >> > >
> >> > > So you put those tables in a separate tablespace, and tell
> >> > > postgresql that the effective_cache_size is 0 (for this tablespace),
> >> > > up to postgres to do the right thing with that ;)
> >> >
> >> > Why would you ever want to set effective_cache_size to 0?
> >>
> >> I think this is a misunderstanding of how effective_cache_size works. I
> >> can't think of any reason to do that. I could see a reason to tell the
> >> OS to not throw a relation into cache but that is a different thing.
> >
> > Well the effective_cache_size in this context is OS cache memory (0 in my
> > case) + estimation of shared_buffer.. ah so DBA should estimate the
> > amount in the shared_buffer only, ok.
> >
> > So consider effective_cache_size = 0 + what pg_buffer_cache will tell.
> >
> > My case is a table containing 29 GB of bytea in a database of 52 GB.
> > Every row on the 29GB table is grab only few times. And it will just
> > renew OS cache memory every time (the server have only 8GB of ram).
> > So when I remove this table (not the index) from the OS cache memory, I
> > keep more interesting blocks in the OS cache memory.
> >
> > And disk + raid are quick enought to bypass the OS cache memory for this
> > tablespace.
> >
> >
> > Are things a bit clearer and usage not so silly ?
>
> Well, I think you're vastly overestimating the power of
> effective_cache_size. effective_cache_size changes the planner's
> estimation of how likely a repeated partial index scan is to find the
> same block in cache. So it only affects
> nested-loop-with-inner-indexscan plans, and if effective_cache_size is
> set to a value larger than the size of the index (or maybe the
> relation, I'm too lazy to go reread the code right now), one value is
> as good as another. For a typical user, I think you could set
> effective_cache_size to, say, a terabyte, and it wouldn't make a bit
> of difference. Heck, why not 2TB.

OK, I don't care too much about this parameter, then.
As we were talking about parameters that can be tablespace-specific, I thought
this one could have different values too.

>
> As far as I can see, the only possible value of setting this knob to a
> value other than positive-infinity is that if you have a huge dataset
> that's not close to fitting in memory, this might cause the planner to
> pick a merge join over a nested loop with inner indexscan, which might
> be better if it makes the I/O sequential rather than random. Anyone
> think I'm a pessimist?
>
> ...Robert
>

--
Cédric Villemain
Database Administrator
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, jd(at)commandprompt(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-23 23:04:19
Message-ID: 4AE23673.7030608@agliodbs.com
Lists: pgsql-hackers

Cedric,

> My case is a table containing 29 GB of bytea in a database of 52 GB. Every row
> on the 29GB table is grab only few times. And it will just renew OS cache
> memory every time (the server have only 8GB of ram).
> So when I remove this table (not the index) from the OS cache memory, I keep
> more interesting blocks in the OS cache memory.

effective_cache_size doesn't control what gets cached, it just tells the
planner about it.

Now, if we had an OS which could be convinced to handle caching
differently for different physical devices, then I could see wanting
this setting to be per-tablespace. For example, it would make a lot of
sense not to FS-cache any data which is on a ramdisk or superfast SSD
array. The same with archive data which you expected to be slow and
infrequently accessed on a NAS device. If your OS can do that, while
caching data from other sources, then it would make sense.

However, I don't know of any current OS which allows for this. Does
anyone else?

--Josh Berkus


From: Stefan Moeding <pgsql(at)moeding(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: per table random-page-cost?
Date: 2009-10-24 06:00:54
Message-ID: 8763a58h61.fsf@esprit.moeding.net
Lists: pgsql-hackers

Hi!

Josh Berkus writes:

> Now, if we had an OS which could be convinced to handle caching
> differently for different physical devices, then I could see wanting
> this setting to be per-tablespace. For example, it would make a lot of
> sense not to FS-cache any data which is on a ramdisk or superfast SSD
> array. The same with archive data which you expected to be slow and
> infrequently accessed on a NAS device. If your OS can do that, while
> caching data from other sources, then it would make sense.
>
> However, I don't know any current OS which allows for this. Does anyone
> else?

Isn't bypassing the buffer cache exactly what direct I/O is about?
Solaris UFS has a "forcedirectio" mount option, AIX JFS2 calls it "dio"
and Veritas VxFS uses the "convosync=direct" option to disable caching
the content of the filesystem.

--
Stefan


From: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, jd(at)commandprompt(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-10-26 17:19:44
Message-ID: 200910261819.45107.cedric.villemain@dalibo.com
Lists: pgsql-hackers

On Saturday, October 24, 2009 01:04:19, Josh Berkus wrote:
> Cedric,
>
> > My case is a table containing 29 GB of bytea in a database of 52 GB. Every
> > row on the 29GB table is grab only few times. And it will just renew OS
> > cache memory every time (the server have only 8GB of ram).
> > So when I remove this table (not the index) from the OS cache memory, I
> > keep more interesting blocks in the OS cache memory.
>
> effective_cache_size doesn't control what gets cached, it just tells the
> planner about it.
>
> Now, if we had an OS which could be convinced to handle caching
> differently for different physical devices, then I could see wanting
> this setting to be per-tablespace. For example, it would make a lot of
> sense not to FS-cache any data which is on a ramdisk or superfast SSD
> array. The same with archive data which you expected to be slow and
> infrequently accessed on a NAS device. If your OS can do that, while
> caching data from other sources, then it would make sense.
>
> However, I don't know any current OS which allows for this. Does anyone
> else?

Isn't that what "fadvise -dontneed" lets you do?

Josh, I'm talking about effective_cache_size per tablespace *exactly* for the
reason you explain.
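[Editor's note: on Linux this is posix_fadvise(POSIX_FADV_DONTNEED), which asks the kernel to drop a file's pages from the page cache. A minimal sketch follows, using Python's os.posix_fadvise wrapper; the call is advisory only, the kernel may ignore it, and it does not exist on all platforms.]

```python
import os

def drop_file_from_page_cache(path):
    """Advise the kernel to evict this file's pages from the OS page
    cache. Purely advisory, and absent on some platforms (e.g. macOS)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # flush dirty pages first so DONTNEED can actually evict them
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)  # offset 0, length 0 = whole file
    finally:
        os.close(fd)
```

Run periodically against the 29 GB table's relation files, this would approximate the "keep this tablespace out of the OS cache" behavior discussed above, without needing filesystem-level direct I/O.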

--
Cédric Villemain
Database Administrator
Cel: +33 (0)6 74 15 56 53
http://dalibo.com - http://dalibo.org


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org, marcin mank <marcin(dot)mank(at)gmail(dot)com>
Subject: Re: per table random-page-cost?
Date: 2009-11-09 23:56:29
Message-ID: 200911092356.nA9NuTn26523@momjian.us
Lists: pgsql-hackers

Robert Haas wrote:
> On Thu, Oct 22, 2009 at 2:28 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> > All,
> >
> > Wouldn't per *tablespace* costs make more sense?
> >
> > --Josh
>
> Yes, we already had several votes in favor of that approach. See upthread.

Added to TODO:

Allow per-tablespace random_page_cost

* http://archives.postgresql.org/pgsql-hackers/2009-10/msg01128.php
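[Editor's note: to illustrate why this knob matters per tablespace, here is a toy cost comparison, simplified far beyond what costsize.c really computes. The constants mirror the GUC defaults, but the model (one random heap fetch per matching tuple) is hypothetical; it shows the same query flipping from index scan to sequential scan when the tablespace's random_page_cost says its storage is slow.]

```python
def seq_scan_cost(pages, tuples, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    # read every page sequentially, examine every tuple
    return pages * seq_page_cost + tuples * cpu_tuple_cost

def index_scan_cost(matches, random_page_cost=4.0, cpu_tuple_cost=0.01):
    # toy model: one random heap-page fetch per matching tuple
    return matches * (random_page_cost + cpu_tuple_cost)

def cheaper_plan(pages, tuples, matches, random_page_cost):
    seq = seq_scan_cost(pages, tuples)
    idx = index_scan_cost(matches, random_page_cost)
    return "index scan" if idx < seq else "seq scan"
```

With a per-tablespace setting, a fully cached tablespace can advertise a random_page_cost near 1 and keep its index scans, while a slow archive tablespace advertising a large value pushes the planner toward sequential I/O, which is the behavior Marcin asked for at the top of the thread.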

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +