Re: Fix pgstatindex using for large indexes

Lists: pgsql-patches
From: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org
Cc: "kasahara(OSSC)" <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
Subject: Fix pgstatindex using for large indexes
Date: 2008-02-21 13:45:55
Message-ID: 47BD8093.7060105@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi.

In pgstatindex.c and pgstattuple.sql, some variables are defined with
int type. So when we try to get informations about a large index by using
pgstatindex, we get strange value of size and density.
Because the values exceed int-max.
# Like following output. I used pgstatindex just after data load.
So "density" is should be nearly 90.

test=# SELECT * FROM pgstatindex('large_index');
-[ RECORD 1 ]------+------------
version | 2
tree_level | 4
index_size | -1349410816 ★
root_block_no | 119666
internal_pages | 28936
leaf_pages | 1379204
empty_pages | 0
deleted_pages | 0
avg_leaf_density | 60.33 ★
leaf_fragmentation | 0

I think that max_avail and free_space should be uint64.
And the output format for index_size should be "%lld" (INT64_FORMAT).

I made the patch and tryed it. (And it seemed OK.)

test=# SELECT * FROM pgstatindex('large_index');
-[ RECORD 1 ]------+------------
version | 2
tree_level | 4
index_size | 11535491072
root_block_no | 119666
internal_pages | 28936
leaf_pages | 1379204
empty_pages | 0
deleted_pages | 0
avg_leaf_density | 90.64
leaf_fragmentation | 0

I also fix *_pages variables just in case.
Please confirm this.

Best regards.

--
NTT OSS Center
Tatsuhito Kasahara

kasahara.tatsuhito _at_ oss.ntt.co.jp

Attachment Content-Type Size
pgstatindex.patch text/plain 4.8 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-21 17:38:54
Message-ID: 12407.1203615534@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp> writes:
> In pgstatindex.c and pgstattuple.sql, some variables are defined with
> int type. So when we try to get informations about a large index by using
> pgstatindex, we get strange value of size and density.
> Because the values exceed int-max.
> ...
> I think that max_avail and free_space should be uint64.

Most places where we've dealt with this before, we use double, which is
guaranteed to be available whereas uint64 is not ...

regards, tom lane


From: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org, "kasahara(OSSC)" <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-24 11:21:28
Message-ID: 47C15338.1080807@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi.

Tom Lane wrote:
>> I think that max_avail and free_space should be uint64.
> Most places where we've dealt with this before, we use double, which is
> guaranteed to be available whereas uint64 is not ...
Oh I see.

I fix the patch.
# I changed "max_avail" and "free_space" to double.

Best regards.

--
NTT OSS Center
Tatsuhito Kasahara

kasahara.tatsuhito _at_ oss.ntt.co.jp

Attachment Content-Type Size
pgstatindex.patch text/plain 3.2 KB

From: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-24 11:53:14
Message-ID: 47C15AAA.3040702@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tatsuhito Kasahara wrote:
> I fix the patch.
Oops, I forgot to attach the patch for pgstattuple.sql.
I send it again.

Best regards.

--
NTT OSS Center
Tatsuhito Kasahara

kasahara.tatsuhito _at_ oss.ntt.co.jp

Attachment Content-Type Size
pgstatindex_v2.patch text/plain 4.5 KB

From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-25 15:26:19
Message-ID: 47C2DE1B.1050703@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane napsal(a):
> Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> In pgstatindex.c and pgstattuple.sql, some variables are defined with
>> int type. So when we try to get informations about a large index by using
>> pgstatindex, we get strange value of size and density.
>> Because the values exceed int-max.
>> ...
>> I think that max_avail and free_space should be uint64.
>
> Most places where we've dealt with this before, we use double, which is
> guaranteed to be available whereas uint64 is not ...

Is this requirement still valid? Is there any currently supported platform which
does not have uint64? IIRC we are going to change datetime to integer for 8.4.
By my opinion uint64 is suitable for head and probably for 8.2 and 8.3 as well.

64bit integer is already used on many places: e.g.

./commands/copy.c
./tcop/utility.c.
./optimizer/plan/planner.c

Zdenek


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-25 16:21:18
Message-ID: 27782.1203956478@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> Tom Lane napsal(a):
>> Most places where we've dealt with this before, we use double, which is
>> guaranteed to be available whereas uint64 is not ...

> Is this requirement still valid?

Yes.

> Is there any currently supported platform which
> does not have uint64?

I don't know, and neither do you.

> IIRC we are going to change datetime to integer for 8.4.

We are going to change the *default* to integer. It will still be
possible to select float datetimes for use on platforms without working
int64. There is not any core functionality that will fail completely
without int64, and none will be introduced anytime soon if I have
anything to say about it.

Now having said that, there are places that use int64 with the
understanding that it might degrade to int32, and that that will be
good enough --- the statistics counters are an example. However,
the only point of the patch being presented for pgstatindex is to
be able to count higher than 2^32, so ISTM we may as well make it
use double. There isn't any particular reason it *shouldn't* be
coded that way, AFAICS, and there is precedent in that VACUUM/ANALYZE
use doubles for similar purposes.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-25 16:41:44
Message-ID: 47C2EFC8.7090505@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane wrote:
>
>> Is there any currently supported platform which
>> does not have uint64?
>>
>
> I don't know, and neither do you.
>
>
>

Maybe we should look at some reasonable way of getting the info out of a
compiled instance. How about if we get pg_config to output the value of
INT64_IS_BUSTED? Then we could add a step to the buildfarm client to
catch all the output from pg_config.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-25 16:50:11
Message-ID: 28246.1203958211@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>>> Is there any currently supported platform which
>>> does not have uint64?
>>
>> I don't know, and neither do you.

> Maybe we should look at some reasonable way of getting the info out of a
> compiled instance. How about if we get pg_config to output the value of
> INT64_IS_BUSTED?

We know all the buildfarm machines have working int64, because they'd
fail the bigint regression test if not. That's not the point here.

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-25 17:21:31
Message-ID: 20080225092131.1c11ddea@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 25 Feb 2008 11:21:18 -0500
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> > IIRC we are going to change datetime to integer for 8.4.
>
> We are going to change the *default* to integer.

Thank goodness. Now I can stop recompiling rpms. Thanks for this
- -hackers.

Joshua D. Drake

- --
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHwvkdATb/zqfZUUQRAnl8AJ4ou850ROztMxHZyGeUaD/uXQRMPACeMN9x
72FC2K3hKKV/Aq+FPWI8UJ4=
=EeIA
-----END PGP SIGNATURE-----


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-25 23:20:12
Message-ID: 47C34D2C.6090605@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane wrote:
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>> Tom Lane napsal(a):
>>> Most places where we've dealt with this before, we use double, which is
>>> guaranteed to be available whereas uint64 is not ...
>
>> Is this requirement still valid?
>
> Yes.
Maybe we should just bite the bullet, and implement int64 emulation
for platforms that don't provide one? I was thinking that something
like "typedef struct { int32 low, int32 high } int64", plus a few
Macros for basic arithmetic should do the trick.

Not particularly rewarding work, given that all major platforms *do*
support int64 - but it'd prevent the discussion that starts everytime
someone proposes a patch that depends on int64.

regards, Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-26 00:51:22
Message-ID: 17226.1203987082@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Florian G. Pflug" <fgp(at)phlo(dot)org> writes:
> Maybe we should just bite the bullet, and implement int64 emulation
> for platforms that don't provide one?

Why? Workarounds such as "use double where needed" have served us
perfectly fine so far, with far less effort and notational ugliness
than this would involve.

There will come a time where either there's a really good reason to rely
on int64, or we feel that it's moot because any platform without int64
is certainly dead anyway. I'm not sure how far off that time is, but
it's probably some fairly small number of years. My position is simply
that pgstattuple does not present a reason to make that decision today,
especially not when making it rely on int64 is at variance with the
coding method already in use in related parts of the core backend.

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-02-26 09:24:08
Message-ID: 47C3DAB8.8000003@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane napsal(a):
> "Florian G. Pflug" <fgp(at)phlo(dot)org> writes:
>> Maybe we should just bite the bullet, and implement int64 emulation
>> for platforms that don't provide one?
>
> Why? Workarounds such as "use double where needed" have served us
> perfectly fine so far, with far less effort and notational ugliness
> than this would involve.

Disadvantage of double usage is in CPUs. Current CPUs are perfectly
optimized for int operation but there are limitation for floating point
operatim. For example T1 (niagara) has only one FP unit per whole CPU
(T2 has one FP per core) and AMD64 has three parallel execution unit for
microcode but only one for FP and FP shares registry with MMX unit.
And operation system has more work with saving FP register and so on...
Not all these limitations are related to PostgreSQL but it is good to know.

By my opinion avoid double is important in critical(bottleneck) places.
How you mentioned double is OK for pgstattuple.

> There will come a time where either there's a really good reason to rely
> on int64, or we feel that it's moot because any platform without int64
> is certainly dead anyway. I'm not sure how far off that time is, but
> it's probably some fairly small number of years.

I don't think support more than five years older platform makes sense
for new version of PostgreSQL. Very often users buy a new hardware for
new database. IIRC all platform (x86, SPARC, MIPS, ALPHA, PARISC ...) do
not have problem with 64bit for longer time.

Zdenek


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-03-01 16:07:26
Message-ID: 20080301160726.GC16327@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Mon, Feb 25, 2008 at 11:50:11AM -0500, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > Tom Lane wrote:
> >>> Is there any currently supported platform which does not have
> >>> uint64?
> >>
> >> I don't know, and neither do you.
>
> > Maybe we should look at some reasonable way of getting the info
> > out of a compiled instance. How about if we get pg_config to
> > output the value of INT64_IS_BUSTED?
>
> We know all the buildfarm machines have working int64, because
> they'd fail the bigint regression test if not. That's not the point
> here.

If we don't have buildfarm coverage for machines where INT64_IS_BUSTED,
how do we know we support those architectures at all?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-03-04 03:30:26
Message-ID: 200803040330.m243UQE00337@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Tatsuhito Kasahara wrote:
> Hi.
>
> Tom Lane wrote:
> >> I think that max_avail and free_space should be uint64.
> > Most places where we've dealt with this before, we use double, which is
> > guaranteed to be available whereas uint64 is not ...
> Oh I see.
>
> I fix the patch.
> # I changed "max_avail" and "free_space" to double.
>
> Best regards.
>
> --
> NTT OSS Center
> Tatsuhito Kasahara
>
> kasahara.tatsuhito _at_ oss.ntt.co.jp
>
>

> *** postgresql-8.3.0.org/contrib/pgstattuple/pgstatindex.c 2007-11-16 06:14:31.000000000 +0900
> --- postgresql-8.3.0/contrib/pgstattuple/pgstatindex.c 2008-02-24 19:35:09.000000000 +0900
> ***************
> *** 68,75 ****
> uint32 empty_pages;
> uint32 deleted_pages;
>
> ! uint32 max_avail;
> ! uint32 free_space;
>
> uint32 fragments;
> } BTIndexStat;
> --- 68,75 ----
> uint32 empty_pages;
> uint32 deleted_pages;
>
> ! double max_avail;
> ! double free_space;
>
> uint32 fragments;
> } BTIndexStat;
> ***************
> *** 87,94 ****
> Relation rel;
> RangeVar *relrv;
> Datum result;
> ! uint32 nblocks;
> ! uint32 blkno;
> BTIndexStat indexStat;
>
> if (!superuser())
> --- 87,94 ----
> Relation rel;
> RangeVar *relrv;
> Datum result;
> ! BlockNumber nblocks;
> ! BlockNumber blkno;
> BTIndexStat indexStat;
>
> if (!superuser())
> ***************
> *** 207,231 ****
> values[j] = palloc(32);
> snprintf(values[j++], 32, "%d", indexStat.level);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%d", (indexStat.root_pages +
> ! indexStat.leaf_pages +
> ! indexStat.internal_pages +
> ! indexStat.deleted_pages +
> ! indexStat.empty_pages) * BLCKSZ);
> values[j] = palloc(32);
> snprintf(values[j++], 32, "%d", indexStat.root_blkno);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%d", indexStat.internal_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%d", indexStat.leaf_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%d", indexStat.empty_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%d", indexStat.deleted_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%.2f", 100.0 - (float) indexStat.free_space / (float) indexStat.max_avail * 100.0);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%.2f", (float) indexStat.fragments / (float) indexStat.leaf_pages * 100.0);
>
> tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc),
> values);
> --- 207,231 ----
> values[j] = palloc(32);
> snprintf(values[j++], 32, "%d", indexStat.level);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%.0f", ( (double) indexStat.root_pages +
> ! (double) indexStat.leaf_pages +
> ! (double) indexStat.internal_pages +
> ! (double) indexStat.deleted_pages +
> ! (double) indexStat.empty_pages) * BLCKSZ);
> values[j] = palloc(32);
> snprintf(values[j++], 32, "%d", indexStat.root_blkno);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%u", indexStat.internal_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%u", indexStat.leaf_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%u", indexStat.empty_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%u", indexStat.deleted_pages);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%.2f", 100.0 - indexStat.free_space / indexStat.max_avail * 100.0);
> values[j] = palloc(32);
> ! snprintf(values[j++], 32, "%.2f", (double) indexStat.fragments / (double) indexStat.leaf_pages * 100.0);
>
> tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc),
> values);
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-03-21 02:41:10
Message-ID: 20080321024110.GA13594@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tatsuhito Kasahara wrote:
> Tatsuhito Kasahara wrote:
> > I fix the patch.
> Oops, I forgot to attach the patch for pgstattuple.sql.
> I send it again.

Hmm, this followup patch is wrong though -- the SQL definition is still
using BIGINT where it should be using double. And the other changes to
use BIGINT where the original values were int4 seem unnecessary.

One thing I'm not clear about is the change from %d to %u to represent
int4 values. Since the SQL datatype is signed, this can't really work,
now, can it?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-03-21 02:45:36
Message-ID: 11677.1206067536@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Tom Lane wrote:
>> Most places where we've dealt with this before, we use double, which is
>> guaranteed to be available whereas uint64 is not ...

> Oh I see.

> I fix the patch.
> # I changed "max_avail" and "free_space" to double.

I took a closer look at this, and noticed that you were still redefining
the output parameters of the function as BIGINT:

***************
*** 33,48 ****
-- pgstatindex
--
CREATE OR REPLACE FUNCTION pgstatindex(IN relname text,
! OUT version int4,
! OUT tree_level int4,
! OUT index_size int4,
! OUT root_block_no int4,
! OUT internal_pages int4,
! OUT leaf_pages int4,
! OUT empty_pages int4,
! OUT deleted_pages int4,
! OUT avg_leaf_density float8,
! OUT leaf_fragmentation float8)
AS 'MODULE_PATHNAME', 'pgstatindex'
LANGUAGE C STRICT;

--- 33,48 ----
-- pgstatindex
--
CREATE OR REPLACE FUNCTION pgstatindex(IN relname text,
! OUT version INT,
! OUT tree_level INT,
! OUT index_size BIGINT,
! OUT root_block_no INT,
! OUT internal_pages BIGINT,
! OUT leaf_pages BIGINT,
! OUT empty_pages BIGINT,
! OUT deleted_pages BIGINT,
! OUT avg_leaf_density FLOAT8,
! OUT leaf_fragmentation FLOAT8)
AS 'MODULE_PATHNAME', 'pgstatindex'
LANGUAGE C STRICT;

This is inconsistent --- if int64 isn't actually available, it's not
likely to work very well. It seems to me that we should either change
the output parameters to float8, or stick with the original version of
the patch and just accept that it will give overflowed answers on
machines without working int64.

Given that there is no problem until you get into multi-terabyte
indexes, which are hardly likely to be getting pushed around on ancient
systems, it's hard to argue that there's really any case against using
bigint. Also I now see that pgstattuple() is using bigint for numbers
that are really much more likely to overflow a broken int64 than
pgstatindex() is, so the argument for float would require us to change
both functions.

In short, I'm willing to drop my opposition to the original form
of the patch.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-03-21 03:11:39
Message-ID: 12185.1206069099@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Hmm, this followup patch is wrong though -- the SQL definition is still
> using BIGINT where it should be using double. And the other changes to
> use BIGINT where the original values were int4 seem unnecessary.

I'm on this one now ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuhito Kasahara <kasahara(dot)tatsuhito(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Fix pgstatindex using for large indexes
Date: 2008-03-21 03:24:45
Message-ID: 12453.1206069885@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I wrote:
> In short, I'm willing to drop my opposition to the original form
> of the patch.

Original version applied with some minor tweaks (notably, pg_relpages
had the same problem).

regards, tom lane