Re: TODO-Item: B-tree fillfactor control

Lists: pgsql-hackerspgsql-patches
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: TODO-Item: B-tree fillfactor control
Date: 2006-01-19 02:00:00
Message-ID: 20060119095341.4DC7.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Hi Hackers,

I'm trying the following TODO item:
[Indexes]
- Add fillfactor to control reserved free space during index creation

I have already made an patch and it seemed to work well.

----
$ ./pgbench -i -s 10
# select relpages from pg_class where relname = 'accounts_pkey';
relpages | 2745 ( default PCTFree is 10% )

# set btree_leaf_free_percent = 0;
# reindex index accounts_pkey;
# select relpages from pg_class where relname = 'accounts_pkey';
relpages | 2475 ( <- about 2745 * 0.9 = 2470.5 )

# set btree_leaf_free_percent = 30;
# reindex index accounts_pkey;
# select relpages from pg_class where relname = 'accounts_pkey';
relpages | 3537 ( <- about 2745 * 0.9 / 0.7 = 3529.3 )
----

And now, I need advice on some issues.

- Is it appropriate to use GUC variables to control fillfactors?
Is it better to extend CREATE INDEX / REINDEX grammar?
- Should indexes remember their fillfactors when they are created?
The last fillfactors will be used on next reindex.
- Is fillfactor useful for hash and gist indexes?
I think hash does not need it, but gist might need it.

Look forward to your comments.
Thanks,

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories

Attachment Content-Type Size
btree_free_percent.patch application/octet-stream 2.6 KB

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-01 17:45:11
Message-ID: 200602011745.k11HjBn02927@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro wrote:
> Hi Hackers,
>
> I'm trying the following TODO item:
> [Indexes]
> - Add fillfactor to control reserved free space during index creation
>
> I have already made an patch and it seemed to work well.

Great.

> And now, I need advice on some issues.
>
> - Is it appropriate to use GUC variables to control fillfactors?
> Is it better to extend CREATE INDEX / REINDEX grammar?

I think it has to be part of CREATE INDEX and ALTER INDEX.

Is there a use for separate node and leaf settings?

This patch needs documentation, and if we have separate items, we are
going to have to explain when to use node or leaf.

> - Should indexes remember their fillfactors when they are created?
> The last fillfactors will be used on next reindex.

They should remember, for sure, and REINDEX should use it. It think
this is similar to the ALTER TABLE ALTER [ COLUMN ] ... SET STATISTICS
functionality. It will need to be dumped as well by pg_dump. If you
need help with any of this, let me know.

> - Is fillfactor useful for hash and gist indexes?
> I think hash does not need it, but gist might need it.

Not sure. We don't know what type of index a GIST will be so we have no
way of knowing. I am thinking we can implement just btree now and the
GIST folks can add it later if they want. My guess is that each GIST is
going to behave differently for different fill-factors, so if allow it
to be set for GIST, GIST developers can pull the value if they want.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-01 18:22:17
Message-ID: 3185.1138818137@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
>> - Should indexes remember their fillfactors when they are created?
>> The last fillfactors will be used on next reindex.

> They should remember, for sure, and REINDEX should use it. It think
> this is similar to the ALTER TABLE ALTER [ COLUMN ] ... SET STATISTICS
> functionality. It will need to be dumped as well by pg_dump.

If you want it to be dumped by pg_dump (which is debatable IMHO) then
it MUST NOT be a syntax extension, it has to be driven by a GUC
variable, else we have compatibility problems with the dumps. We just
went through this with WITH/WITHOUT OIDS.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-01 18:32:24
Message-ID: 200602011832.k11IWOM10789@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> >> - Should indexes remember their fillfactors when they are created?
> >> The last fillfactors will be used on next reindex.
>
> > They should remember, for sure, and REINDEX should use it. It think
> > this is similar to the ALTER TABLE ALTER [ COLUMN ] ... SET STATISTICS
> > functionality. It will need to be dumped as well by pg_dump.
>
> If you want it to be dumped by pg_dump (which is debatable IMHO) then
> it MUST NOT be a syntax extension, it has to be driven by a GUC
> variable, else we have compatibility problems with the dumps. We just
> went through this with WITH/WITHOUT OIDS.

OK, so we are going to issue a GUC to set the fill factor in pg_dump,
but still have a fillfactor syntax for use by users? That is how we do
WITH/WITHOUT OIDS.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-02 01:16:43
Message-ID: 43E15D7B.6020903@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> If you want it to be dumped by pg_dump (which is debatable IMHO) then
> it MUST NOT be a syntax extension, it has to be driven by a GUC
> variable, else we have compatibility problems with the dumps. We just
> went through this with WITH/WITHOUT OIDS.

Compatibility problems? CREATE INDEX isn't an SQL standard command is it?

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-02 01:59:37
Message-ID: 7314.1138845577@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au> writes:
> Compatibility problems? CREATE INDEX isn't an SQL standard command is it?

No, but it'll cause unnecessary cross-version compatibility issues for
us.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-02 02:24:19
Message-ID: 200602020224.k122OJn22191@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au> writes:
> > Compatibility problems? CREATE INDEX isn't an SQL standard command is it?
>
> No, but it'll cause unnecessary cross-version compatibility issues for
> us.

It is true it isn't SQL standard, but I think our CREATE INDEX syntax
matches many vendor's syntax in most cases.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-02 08:19:43
Message-ID: 20060202163110.4A8B.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> wrote:

> > - Is fillfactor useful for hash and gist indexes?
> > I think hash does not need it, but gist might need it.
>
> Not sure. We don't know what type of index a GIST will be so we have no
> way of knowing. I am thinking we can implement just btree now and the
> GIST folks can add it later if they want. My guess is that each GIST is
> going to behave differently for different fill-factors, so if allow it
> to be set for GIST, GIST developers can pull the value if they want.

My understanding about hash was wrong. It uses fill factor of 75%, which is
hard-coded. On the other hand, GIST has no ability to control fill factor
currently. I'm trying to add fill factors to hash and gist, so I'll ask
index developers to review a patch in the future.

> > - Is it appropriate to use GUC variables to control fillfactors?
> > Is it better to extend CREATE INDEX / REINDEX grammar?
>
> I think it has to be part of CREATE INDEX and ALTER INDEX.

SQL standard has no regulation for indexes, so I refered to other databases.
- Oracle and DB2 : CREATE INDEX index ON table (...) PCTFREE 30;
- MS SQL Server : CREATE INDEX index ON table (...) WITH FILLFACTOR = 70;

PCTFREE seems to be common, so I'll extend DDL to use PCTFREE syntax.
The following two syntaxes will be able to be used.
1. SET btree_free_percent = 30;
CREATE INDEX index ON table (...);
SET btree_free_percent = 10; -- revert
2. CREATE INDEX index ON table (...) PCTFREE 30;

1 would be useful for a compatibe pg_dump format, per suggestion from Tom.

> Is there a use for separate node and leaf settings?

We should use different settings for leaf and node, but it may confuse users.
So I'll simplify the setting as follows:
node_free_percent = Min(30%, 3 * leaf_free_percent)
When leaf_free_percent is 10%, node_free_percent is 30%. They are the same
values of the current implementation.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-02 17:47:43
Message-ID: 200602021747.k12Hlha29377@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

ITAGAKI Takahiro wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> wrote:
>
> > > - Is fillfactor useful for hash and gist indexes?
> > > I think hash does not need it, but gist might need it.
> >
> > Not sure. We don't know what type of index a GIST will be so we have no
> > way of knowing. I am thinking we can implement just btree now and the
> > GIST folks can add it later if they want. My guess is that each GIST is
> > going to behave differently for different fill-factors, so if allow it
> > to be set for GIST, GIST developers can pull the value if they want.
>
> My understanding about hash was wrong. It uses fill factor of 75%, which is
> hard-coded. On the other hand, GIST has no ability to control fill factor
> currently. I'm trying to add fill factors to hash and gist, so I'll ask
> index developers to review a patch in the future.

OK.

> > > - Is it appropriate to use GUC variables to control fillfactors?
> > > Is it better to extend CREATE INDEX / REINDEX grammar?
> >
> > I think it has to be part of CREATE INDEX and ALTER INDEX.
>
> SQL standard has no regulation for indexes, so I refered to other databases.
> - Oracle and DB2 : CREATE INDEX index ON table (...) PCTFREE 30;
> - MS SQL Server : CREATE INDEX index ON table (...) WITH FILLFACTOR = 70;
>
> PCTFREE seems to be common, so I'll extend DDL to use PCTFREE syntax.
> The following two syntaxes will be able to be used.
> 1. SET btree_free_percent = 30;
> CREATE INDEX index ON table (...);
> SET btree_free_percent = 10; -- revert
> 2. CREATE INDEX index ON table (...) PCTFREE 30;
>
> 1 would be useful for a compatibe pg_dump format, per suggestion from Tom.

I personally like FILLFACTOR, but I understand the desire to match
Oracle. PCTFREE seems too abreviated for me, but it would match the GUC
better, so maybe it is the best.

> > Is there a use for separate node and leaf settings?
>
> We should use different settings for leaf and node, but it may confuse users.
> So I'll simplify the setting as follows:
> node_free_percent = Min(30%, 3 * leaf_free_percent)
> When leaf_free_percent is 10%, node_free_percent is 30%. They are the same
> values of the current implementation.

Yes, I think that is ideal.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org
Subject: TODO-Item: B-tree fillfactor control
Date: 2006-02-06 04:27:21
Message-ID: 20060206120725.49B5.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

This is a draft patch for index fillfactor control discussed in
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00013.php

I added the following features:
- Add support for btree, hash and gist.
- Syntax extension using PCTFREE.
- Save settings to catalog. Next REINDEX will use the last value.

I'd like to ask index developers to review the patch, especially
the method to control fill factor for hash and gist.
I'll write documentations if there is no problem in the features.
Comments are welcome.

[Syntax extension]
- CREATE INDEX index ON table (columns) [ PCTFREE percent ];
- REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ PCTFREE percent ];
- ALTER INDEX index SET PCTFREE percent;
- PRIMARY KEY, UNIQUE constraint
CREATE TABLE / ALTER TABLE table ADD PRIMARY KEY [ PCTFREE percent ]
- with GUC
SET btree_free_percent = 30;
CREATE INDEX index ON table (...);
SET btree_free_percent = 10; -- revert

[Test and Result]
# CREATE table test1 (i int);
# INSERT INTO test1 SELECT generate_series(1, 100000);
# CREATE table test2 (c circle);
# INSERT INTO test2
# SELECT circle(point(100 * random(), 100 * random()), random())
# from generate_series(1, 100000);

# CREATE INDEX idx1_btree_0 ON test1 USING btree (i) PCTFREE 0;
# CREATE INDEX idx1_btree_10 ON test1 USING btree (i) PCTFREE 10;
# CREATE INDEX idx1_btree_30 ON test1 USING btree (i) PCTFREE 30;
# CREATE INDEX idx1_hash_0 ON test1 USING hash (i) PCTFREE 0;
# CREATE INDEX idx1_hash_25 ON test1 USING hash (i) PCTFREE 25;
# CREATE INDEX idx1_hash_40 ON test1 USING hash (i) PCTFREE 40;
# CREATE INDEX idx2_gist_0 ON test2 USING gist (c) PCTFREE 0;
# CREATE INDEX idx2_gist_10 ON test2 USING gist (c) PCTFREE 10;
# CREATE INDEX idx2_gist_30 ON test2 USING gist (c) PCTFREE 30;

# SELECT relname, relpages from pg_class where relname LIKE 'idx%' ORDER BY relname;
relname | relpages
---------------+----------
idx1_btree_0 | 249
idx1_btree_10 | 276 -- 249 / 0.9 = 277
idx1_btree_30 | 357 -- 249 / 0.7 = 356
idx1_hash_0 | 375
idx1_hash_25 | 413 -- Hash is not linear against fill factors.
idx1_hash_40 | 453 --
idx2_gist_0 | 882
idx2_gist_10 | 977 -- 882 / 0.9 = 980
idx2_gist_30 | 1273 -- 882 / 0.7 = 1260
(9 rows)

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories

Attachment Content-Type Size
index_free_percent-0206.patch application/octet-stream 67.7 KB

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-07 21:44:27
Message-ID: 1139348667.1258.163.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Mon, 2006-02-06 at 13:27 +0900, ITAGAKI Takahiro wrote:
> This is a draft patch for index fillfactor control discussed in
> http://archives.postgresql.org/pgsql-hackers/2006-02/msg00013.php
>
> I added the following features:
> - Add support for btree, hash and gist.
> - Syntax extension using PCTFREE.
> - Save settings to catalog. Next REINDEX will use the last value.
>
> I'd like to ask index developers to review the patch, especially
> the method to control fill factor for hash and gist.
> I'll write documentations if there is no problem in the features.
> Comments are welcome.

Looks pretty complete to me. A useful patch for large databases.

Do you have any performance numbers for the extreme settings? It may be
worth having different max limits for each of the index types, since
they differ so widely in algorithms. Do we have any tests to show
whether 3*setting is the right value for b-tree node pages? It sounds
about right but I have no evidence either way.

I'm surprised that you do not use the parameter to control the RIGHTMOST
index block split factor for B-trees, which remains at a constant 67%.
The PCTFREE only seems to apply at CREATE INDEX time.

"The steady-state load factor for btrees is usually estimated at 70%."
but we recognise that estimate as being from the 1980s and not
necessarily reflecting all application types for which we now use
databases.

Can we use the PCTFREE setting to control the RIGHTMOST behaviour? If I
manually control the PCTFREE I want it to work like that all of the
time, not just some of the time.

[i.e. with this patch if I fill an index with 1000 blocks of data using
PCTFREE 0 the index will use 1000 blocks. If I COPY another 1000 blocks
of data the index would then be 1500 blocks larger, 2500 total. The
current cvstip acts thus: if I fill an index with 1000 blocks of data
the index will use 1111 blocks. If I COPY another 1000 blocks of data
the index would then be 1500 blocks larger, 2611 total. I'd like to be
able to have the index use only 2000 blocks when PCTFREE=0 - if I ask
for fully packed I want fully packed, please]

If we support PCTFREE for compatibility reasons should we not also
support the alternative FILLFACTOR syntax also? I see no reason to
favour Oracle/DB2 compatability at the expense of SQLServer
compatibility.

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-07 22:09:30
Message-ID: 843.1139350170@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> If we support PCTFREE for compatibility reasons should we not also
> support the alternative FILLFACTOR syntax also? I see no reason to
> favour Oracle/DB2 compatability at the expense of SQLServer
> compatibility.

One nonstandard syntax is more than enough.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-10 10:12:48
Message-ID: 20060210170213.48E1.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

This is a revised patch for index fillfactor control:
- Split MAX_PCTFREE into three for each index method.
- B-tree indexes use their own settings when rightmost page is split.
- Fix a bug that GUC is modified when index building is canceled.
- Add some documentations.

Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> Do you have any performance numbers for the extreme settings? It may be
> worth having different max limits for each of the index types, since
> they differ so widely in algorithms.

Different max limits are done.
I worry about whether index works properly on high PCTFREE settings. I found
hash has its own sanity checking, but I don't know other indexes have.

> I'm surprised that you do not use the parameter to control the RIGHTMOST
> index block split factor for B-trees, which remains at a constant 67%.
> The PCTFREE only seems to apply at CREATE INDEX time.

Thanks for pointing out. I did not inadvertently use fillfactor on
the rightmost page. With the revised patch, PCTFREE will be considered
in such cases.

# CREATE TABLE test (i int);
# INSERT INTO test SELECT generate_series(1, 100000);
# CREATE INDEX btree ON test USING btree (i) PCTFREE 0;
# SELECT relpages from pg_class where relname ='btree';
relpages | 249
# INSERT INTO test SELECT generate_series(100001, 200000);
# SELECT relpages from pg_class where relname ='btree';
relpages | 497 <-- +99.6%

But default settings will change. Is this ok?

| | patched |
| now | free=10 | free=0 |
-----------------+-----+---------+--------+-
leaf (REINDEX) | 10 | 10 | 0 |
leaf (RIGHTMOST) | 30 | 10 | 0 | = leaf
node (REINDEX) | 30 | 30 | 0 | = 3*leaf

> If we support PCTFREE for compatibility reasons should we not also
> support the alternative FILLFACTOR syntax also? I see no reason to
> favour Oracle/DB2 compatability at the expense of SQLServer
> compatibility.

There are few synonyms in PostgreSQL, so I think it is better for us to
adopt only either one. I like FILLFACTOR personally, but compatibility
with Oracle is more important to users around me.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories

Attachment Content-Type Size
index_free_percent-0210.patch application/octet-stream 71.5 KB

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-10 13:39:52
Message-ID: 1139578792.1258.474.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, 2006-02-10 at 19:12 +0900, ITAGAKI Takahiro wrote:
> This is a revised patch for index fillfactor control:
> - Split MAX_PCTFREE into three for each index method.
> - B-tree indexes use their own settings when rightmost page is split.
> - Fix a bug that GUC is modified when index building is canceled.
> - Add some documentations.

> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > Do you have any performance numbers for the extreme settings? It may be
> > worth having different max limits for each of the index types, since
> > they differ so widely in algorithms.
>
> Different max limits are done.
> I worry about whether index works properly on high PCTFREE settings. I found
> hash has its own sanity checking, but I don't know other indexes have.

Thanks.

> > I'm surprised that you do not use the parameter to control the RIGHTMOST
> > index block split factor for B-trees, which remains at a constant 67%.
> > The PCTFREE only seems to apply at CREATE INDEX time.
>
> Thanks for pointing out. I did not inadvertently use fillfactor on
> the rightmost page. With the revised patch, PCTFREE will be considered
> in such cases.
>
> # CREATE TABLE test (i int);
> # INSERT INTO test SELECT generate_series(1, 100000);
> # CREATE INDEX btree ON test USING btree (i) PCTFREE 0;
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 249
> # INSERT INTO test SELECT generate_series(100001, 200000);
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 497 <-- +99.6%
>

This is great.

> But default settings will change. Is this ok?
>
> | | patched |
> | now | free=10 | free=0 |
> -----------------+-----+---------+--------+-
> leaf (REINDEX) | 10 | 10 | 0 |
> leaf (RIGHTMOST) | 30 | 10 | 0 | = leaf
> node (REINDEX) | 30 | 30 | 0 | = 3*leaf

I think thats appropriate; lets see what others think.

> > If we support PCTFREE for compatibility reasons should we not also
> > support the alternative FILLFACTOR syntax also? I see no reason to
> > favour Oracle/DB2 compatability at the expense of SQLServer
> > compatibility.
>
> There are few synonyms in PostgreSQL, so I think it is better for us to
> adopt only either one. I like FILLFACTOR personally, but compatibility
> with Oracle is more important to users around me.

OK, no probs.

Reading through rest of patch now.

Best Regards, Simon Riggs


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: TODO-Item: B-tree fillfactor control
Date: 2006-02-13 12:59:30
Message-ID: 1139835570.1258.761.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, 2006-02-10 at 19:12 +0900, ITAGAKI Takahiro wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > I'm surprised that you do not use the parameter to control the RIGHTMOST
> > index block split factor for B-trees, which remains at a constant 67%.
> > The PCTFREE only seems to apply at CREATE INDEX time.
>
> Thanks for pointing out. I did not inadvertently use fillfactor on
> the rightmost page. With the revised patch, PCTFREE will be considered
> in such cases.
>
> # CREATE TABLE test (i int);
> # INSERT INTO test SELECT generate_series(1, 100000);
> # CREATE INDEX btree ON test USING btree (i) PCTFREE 0;
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 249
> # INSERT INTO test SELECT generate_series(100001, 200000);
> # SELECT relpages from pg_class where relname ='btree';
> relpages | 497 <-- +99.6%
>

This additional functionality looks like it would work for b-trees. I've
not looked at this for GIST and hash indexes.

The reduction in index size should give useful performance gains on
larger, growing tables with increasing keys. We'll test that.

Best Regards, Simon Riggs