Re: the number of pending entries in GIN index with FASTUPDATE=on

Lists: pgsql-hackers
From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-02 16:48:56
Message-ID: CAHGQGwEvX=iK210jpTjZEXeWgTFR6k3NmmMoezAByTLbAZ6eqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Is there the way to know the number of pending entries in GIN index which
was created with FASTUPDATE = on? If not, is it worth implementing the
function returning that number?

I sometimes would like to know that number when I measure how much
pending entries affect the performance of GIN index scan and tune how
frequently autovacuum should run VACUUM to clean up them.

Regards,

--
Fujii Masao


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-06 16:37:29
Message-ID: CA+Tgmob1SfvFd6H_Hk1YepOOsrece7Fa-KkjaWqXaRaX4stu3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 2, 2012 at 12:48 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> Is there the way to know the number of pending entries in GIN index which
> was created with FASTUPDATE = on? If not, is it worth implementing the
> function returning that number?
>
> I sometimes would like to know that number when I measure how much
> pending entries affect the performance of GIN index scan and tune how
> frequently autovacuum should run VACUUM to clean up them.

Seems useful to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-06 18:01:30
Message-ID: 16271.1352224890@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Nov 2, 2012 at 12:48 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> Is there the way to know the number of pending entries in GIN index which
>> was created with FASTUPDATE = on? If not, is it worth implementing the
>> function returning that number?

> Seems useful to me.

Seems like the appropriate place to expose this would be in a
GIN-specific variant of contrib/pgstattuple's pgstatindex(). I seem to
recall some previous discussion about how to fix the btree-centricity
of that function's API, but I don't recall if we thought of a good
solution.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-06 19:34:53
Message-ID: CA+TgmoaB1zNqQUrODYVzhc=oENrm6jDqZrECeU2zdD=138=-=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 6, 2012 at 1:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Nov 2, 2012 at 12:48 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> Is there the way to know the number of pending entries in GIN index which
>>> was created with FASTUPDATE = on? If not, is it worth implementing the
>>> function returning that number?
>
>> Seems useful to me.
>
> Seems like the appropriate place to expose this would be in a
> GIN-specific variant of contrib/pgstattuple's pgstatindex().

Yeah, that seems good to me, too. Or something in pgstatindex, anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-08 18:42:44
Message-ID: CAHGQGwHqWG4Qu-UAZG5cYGXF0WB8e9yQBk9ThezR1YwCNPoCOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 7, 2012 at 4:34 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Nov 6, 2012 at 1:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Fri, Nov 2, 2012 at 12:48 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>> Is there the way to know the number of pending entries in GIN index which
>>>> was created with FASTUPDATE = on? If not, is it worth implementing the
>>>> function returning that number?
>>
>>> Seems useful to me.
>>
>> Seems like the appropriate place to expose this would be in a
>> GIN-specific variant of contrib/pgstattuple's pgstatindex().
>
> Yeah, that seems good to me, too. Or something in pgstatindex, anyway.

Agreed. Attached patch introduces the pgstatginindex() which now reports
GIN version number, number of pages in the pending list and number of
tuples in the pending list, as information about a GIN index.

Regards,

--
Fujii Masao

Attachment Content-Type Size
pgstatginindex_v1.patch application/octet-stream 12.8 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: masao(dot)fujii(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-20 07:44:39
Message-ID: 20121120.164439.248518126.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

> Agreed. Attached patch introduces the pgstatginindex() which now reports
> GIN version number, number of pages in the pending list and number of
> tuples in the pending list, as information about a GIN index.

It seems fine on the whole, and I have some suggestions.

1. This patch applies current git head cleanly, but installation
ends up with failure because of the lack of
pgstattuple--1.0--1.1.sql which added in Makefile.

2. I feel somewhat uneasy with size for palloc's (it's too long),
and BuildTupleFromCString used instead of heap_from_tuple.. But
it would lead additional modification for existent simillars.

You can leave that if you prefer to keep this patch smaller,
but it looks to me more preferable to construct the result
tuple not via c-strings in some aspects. (*1)

3. pgstatginidex shows only version, pending_pages, and
pendingtuples. Why don't you show the other properties such as
entry pages, data pages, entries, and total pages as
pgstatindex does?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

(*1) Sample

diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index 8a2ae85..71c2023 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -29,2 +29,3 @@

+#include "access/htup_details.h"
#include "access/gin_private.h"
@@ -39,3 +40,2 @@

-
extern Datum pgstatindex(PG_FUNCTION_ARGS);
@@ -330,4 +330,5 @@ pgstatginindex(PG_FUNCTION_ARGS)
int j;
- char *values[3];
+ Datum values[3];
Datum result;
+ bool nulls[3] = {false, false, false};

@@ -376,11 +377,6 @@ pgstatginindex(PG_FUNCTION_ARGS)
j = 0;
- values[j] = palloc(32);
- snprintf(values[j++], 32, "%d", stats.version);
- values[j] = palloc(32);
- snprintf(values[j++], 32, "%u", stats.pending_pages);
- values[j] = palloc(64);
- snprintf(values[j++], 64, INT64_FORMAT, stats.pending_tuples);
-
- tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc),
- values);
+ values[j++] = Int32GetDatum(stats.version);
+ values[j++] = UInt32GetDatum(stats.pending_pages);
+ values[j++] = Int64GetDatumFast(stats.pending_tuples);
+ tuple = heap_form_tuple(tupleDesc, values, nulls);


From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-22 19:14:38
Message-ID: CAHGQGwHgjeROGxp8P0EbDV_9-xG9J2OUNOVhssXnetV2tFeZaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 20, 2012 at 4:44 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hello,
>
>> Agreed. Attached patch introduces the pgstatginindex() which now reports
>> GIN version number, number of pages in the pending list and number of
>> tuples in the pending list, as information about a GIN index.
>
> It seems fine on the whole, and I have some suggestions.

Thanks for the review!

> 1. This patch applies current git head cleanly, but installation
> ends up with failure because of the lack of
> pgstattuple--1.0--1.1.sql which added in Makefile.

Added pgstattuple--1.0--1.1.sql.

> 2. I feel somewhat uneasy with size for palloc's (it's too long),
> and BuildTupleFromCString used instead of heap_from_tuple.. But
> it would lead additional modification for existent simillars.
>
> You can leave that if you prefer to keep this patch smaller,
> but it looks to me more preferable to construct the result
> tuple not via c-strings in some aspects. (*1)

OK. I changed the code as you suggested.

Updated version of the patch attached.

>
> 3. pgstatginidex shows only version, pending_pages, and
> pendingtuples. Why don't you show the other properties such as
> entry pages, data pages, entries, and total pages as
> pgstatindex does?

I didn't expose those because they are accurate as of last VACUUM.
But if you think they are useful, I don't object to expose them.

Regards,

--
Fujii Masao

Attachment Content-Type Size
pgstatginindex_v2.patch application/octet-stream 13.2 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: masao(dot)fujii(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-11-28 02:11:39
Message-ID: 20121128.111139.211353910.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > 1. This patch applies current git head cleanly, but installation
> > ends up with failure because of the lack of
> > pgstattuple--1.0--1.1.sql which added in Makefile.
>
> Added pgstattuple--1.0--1.1.sql.

Good. Installation completed and ALTER EXTENSION UPDATE works
with that.

> > 2. I feel somewhat uneasy with size for palloc's (it's too long),
> > and BuildTupleFromCString used instead of heap_from_tuple.. But
> > it would lead additional modification for existent simillars.
>
> OK. I changed the code as you suggested.

Thank you. It looks simpler than the last one. (although the way
differs to pgstatindex..)

> > 3. pgstatginidex shows only version, pending_pages, and
> > pendingtuples. Why don't you show the other properties such as
> > entry pages, data pages, entries, and total pages as
> > pgstatindex does?
>
> I didn't expose those because they are accurate as of last VACUUM.
> But if you think they are useful, I don't object to expose them.

Ok, my point was the apparent consistency of the functions. I
don't have any distinct wish about this.

I'll mark this as Ready for Committer.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: masao(dot)fujii(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Date: 2012-12-05 08:08:38
Message-ID: 50BF0106.1000401@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28.11.2012 04:11, Kyotaro HORIGUCHI wrote:
>>> 3. pgstatginidex shows only version, pending_pages, and
>>> pendingtuples. Why don't you show the other properties such as
>>> entry pages, data pages, entries, and total pages as
>>> pgstatindex does?
>>
>> I didn't expose those because they are accurate as of last VACUUM.
>> But if you think they are useful, I don't object to expose them.
>
> Ok, my point was the apparent consistency of the functions. I
> don't have any distinct wish about this.

We can always add more fields later if there's a need.

> I'll mark this as Ready for Committer.

Thanks, committed.

- Heikki