Re: Fix for Index Advisor related hooks

Lists: pgsql-hackers
From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Fix for Index Advisor related hooks
Date: 2011-02-11 20:44:22
Message-ID: AANLkTikHUBrx5o==7O1Tk7i+Jnhrox3yh+j0igcbYU--@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Looks like the function get_actual_variable_range() was written with the
knowledge that virtual/hypothetical indexes may exist, but the assumption
seems wrong.

One one hand get_actual_variable_range() expects that virtual indexes do not
have an OID assigned, on the other hand explain_get_index_name_hook() is
handed just an index's OID to get its name back; IMHO these are based on two
conflicting assumptions about whether a virtual index will have an OID
assigned.

Attached patch fix_get_actual_variable_range.patch tries to fix this by
introducing a new hook that can help Postgres decide if an index is
fictitious or not.

Also attached is the patch expose_IndexSupportInitialize.patch, that makes
the static function IndexSupportInitialize() global so that the Index
Advisor doesn't have to reinvent the wheel to prepare an index structure
with opfamilies and opclasses.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Attachment Content-Type Size
fix_get_actual_variable_range.patch text/x-patch 1.7 KB
expose_IndexSupportInitialize.patch text/x-patch 2.0 KB

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-15 13:24:26
Message-ID: 4D5A7E8A.9010008@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11.02.2011 22:44, Gurjeet Singh wrote:
> Looks like the function get_actual_variable_range() was written with the
> knowledge that virtual/hypothetical indexes may exist, but the assumption
> seems wrong.
>
> One one hand get_actual_variable_range() expects that virtual indexes do not
> have an OID assigned, on the other hand explain_get_index_name_hook() is
> handed just an index's OID to get its name back; IMHO these are based on two
> conflicting assumptions about whether a virtual index will have an OID
> assigned.
>
> Attached patch fix_get_actual_variable_range.patch tries to fix this by
> introducing a new hook that can help Postgres decide if an index is
> fictitious or not.

The new hook takes an index oid as argument, so I gather that you
resolved the contradiction by deciding that fictitious indexes have
OIDs. How do you assign those OIDs? Do fictitious indexes have entries
in pg_index?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-15 13:56:06
Message-ID: AANLkTi=QXqM6a8TQYztP4mdRthq6x7kD67_yYFMpqnTS@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 15, 2011 at 8:24 AM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> On 11.02.2011 22:44, Gurjeet Singh wrote:
>
>> Looks like the function get_actual_variable_range() was written with the
>> knowledge that virtual/hypothetical indexes may exist, but the assumption
>> seems wrong.
>>
>> One one hand get_actual_variable_range() expects that virtual indexes do
>> not
>> have an OID assigned, on the other hand explain_get_index_name_hook() is
>> handed just an index's OID to get its name back; IMHO these are based on
>> two
>> conflicting assumptions about whether a virtual index will have an OID
>> assigned.
>>
>> Attached patch fix_get_actual_variable_range.patch tries to fix this by
>> introducing a new hook that can help Postgres decide if an index is
>> fictitious or not.
>>
>
> The new hook takes an index oid as argument, so I gather that you resolved
> the contradiction by deciding that fictitious indexes have OIDs. How do you
> assign those OIDs? Do fictitious indexes have entries in pg_index?<http://www.enterprisedb.com>

No, a fictitious index does not touch pg_index. The Index Advisor uses
GetNewOid(pg_class) to generate a new OID for the fictitious index.

An OID is the only way we can identify one fictitious index from the list of
all the others fictitious ones when explain_get_index_name_hook() is called.
I don't see any other way the Postgres server can ask the for the details of
a fictitious index.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-15 16:59:54
Message-ID: 19631.1297789194@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> Also attached is the patch expose_IndexSupportInitialize.patch, that makes
> the static function IndexSupportInitialize() global so that the Index
> Advisor doesn't have to reinvent the wheel to prepare an index structure
> with opfamilies and opclasses.

We are *not* doing that. That's a private function and will remain so.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-15 17:51:31
Message-ID: 26721.1297792291@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> On Tue, Feb 15, 2011 at 8:24 AM, Heikki Linnakangas <
> heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 11.02.2011 22:44, Gurjeet Singh wrote:
>>> One one hand get_actual_variable_range() expects that virtual indexes do
>>> not
>>> have an OID assigned, on the other hand explain_get_index_name_hook() is
>>> handed just an index's OID to get its name back; IMHO these are based on
>>> two
>>> conflicting assumptions about whether a virtual index will have an OID
>>> assigned.

>> The new hook takes an index oid as argument, so I gather that you resolved
>> the contradiction by deciding that fictitious indexes have OIDs. How do you
>> assign those OIDs? Do fictitious indexes have entries in pg_index?

> No, a fictitious index does not touch pg_index. The Index Advisor uses
> GetNewOid(pg_class) to generate a new OID for the fictitious index.

That seems like a very expensive, and lock-inducing, way of assigning a
fictitious OID. They don't need to be globally unique. I suggest you
consider the idea I suggested back in 2007:

* In this toy example we just assign all hypothetical indexes
* OID 0, and the explain_get_index_name hook just prints
* <hypothetical index>. In a realistic situation we'd probably
* assume that OIDs smaller than, say, 100 are never the OID of
* any real index, allowing us to identify one of up to 100
* hypothetical indexes per plan. Then we'd need to save aside
* some state data that would let the explain hooks print info
* about the selected index.

As far as the immediate problem goes, I agree that
get_actual_variable_range is mistaken, but I think a cleaner and cheaper
solution would be to add a bool "hypothetical" to IndexOptInfo.

regards, tom lane


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 14:15:32
Message-ID: AANLkTinRFCWMweztTCfY5-9TMnjxfgCdKCQ9r+HT85R4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 15, 2011 at 11:59 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> > Also attached is the patch expose_IndexSupportInitialize.patch, that
> makes
> > the static function IndexSupportInitialize() global so that the Index
> > Advisor doesn't have to reinvent the wheel to prepare an index structure
> > with opfamilies and opclasses.
>
> We are *not* doing that. That's a private function and will remain so.
>

I understand that we need to hide guts of an implementation. But without
this the Index Advisor will have to emulate what LookupOpclassInfo() does
and that's a lot of code that I am afraid, if emulated by another function
in Index Advisor, is more prone to obsolecence than calling
IndexSupportInitialize().

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 14:31:44
Message-ID: AANLkTikJ2FoECrxX81LvUySyt_hv04yoq5QBx0Yz=xp_@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 15, 2011 at 12:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> > On Tue, Feb 15, 2011 at 8:24 AM, Heikki Linnakangas <
> > heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> >> On 11.02.2011 22:44, Gurjeet Singh wrote:
> >>> One one hand get_actual_variable_range() expects that virtual indexes
> do
> >>> not
> >>> have an OID assigned, on the other hand explain_get_index_name_hook()
> is
> >>> handed just an index's OID to get its name back; IMHO these are based
> on
> >>> two
> >>> conflicting assumptions about whether a virtual index will have an OID
> >>> assigned.
>
> >> The new hook takes an index oid as argument, so I gather that you
> resolved
> >> the contradiction by deciding that fictitious indexes have OIDs. How do
> you
> >> assign those OIDs? Do fictitious indexes have entries in pg_index?
>
> > No, a fictitious index does not touch pg_index. The Index Advisor uses
> > GetNewOid(pg_class) to generate a new OID for the fictitious index.
>
> That seems like a very expensive, and lock-inducing, way of assigning a
> fictitious OID. They don't need to be globally unique.

They need to be unique for a run a session, and be distinguishable from
normal indexes so that explain_get_index_name_hook() can get a generated
name for the hypothetical index.

> I suggest you
> consider the idea I suggested back in 2007:
>
> * In this toy example we just assign all hypothetical indexes
> * OID 0, and the explain_get_index_name hook just prints
> * <hypothetical index>. In a realistic situation we'd probably
> * assume that OIDs smaller than, say, 100 are never the OID of
> * any real index, allowing us to identify one of up to 100
> * hypothetical indexes per plan. Then we'd need to save aside
> * some state data that would let the explain hooks print info
> * about the selected index.
>
> As far as the immediate problem goes, I agree that
> get_actual_variable_range is mistaken, but I think a cleaner and cheaper
> solution would be to add a bool "hypothetical" to IndexOptInfo.
>

Currently there are 2 sites interested in knowing if an index is
hypothetical:

1) explain_get_index_name_hook
2) get_actual_variable_range()

With this bool isHypothetical in place, explain_get_index_name() would
be unchanged, and the Index Advisor can use a locally generated Oid (not
from pg_class) to uniquely identify a hypothetical index.

And get_actual_variable_range() would be changed so:

- if (!OidIsValid(index->indexoid))
+ if (index->ishypothetical)

I can code submit a patch for that.

BTW, you use the term 'fictitious' in the comments, would it be better
to standardize the term used for such an index? So either the comment would
be changed to call it hypothetical, or the structure member would be changed
to isfictitious.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 15:25:08
Message-ID: 24951.1297869908@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> I understand that we need to hide guts of an implementation. But without
> this the Index Advisor will have to emulate what LookupOpclassInfo() does
> and that's a lot of code that I am afraid, if emulated by another function
> in Index Advisor, is more prone to obsolecence than calling
> IndexSupportInitialize().

The only reason you'd need that code is if you were trying to construct
a fake Relation structure, which seems unnecessary and undesirable.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 15:26:47
Message-ID: 24996.1297870007@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> BTW, you use the term 'fictitious' in the comments, would it be better
> to standardize the term used for such an index? So either the comment would
> be changed to call it hypothetical, or the structure member would be changed
> to isfictitious.

Yeah, hypothetical is the more-established term I think.

regards, tom lane


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 18:30:35
Message-ID: AANLkTinrpTy-=ZENFp1VitwkFvkfrJVFVyND0fp+L1YN@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 16, 2011 at 10:25 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> > I understand that we need to hide guts of an implementation. But without
> > this the Index Advisor will have to emulate what LookupOpclassInfo() does
> > and that's a lot of code that I am afraid, if emulated by another
> function
> > in Index Advisor, is more prone to obsolecence than calling
> > IndexSupportInitialize().
>
> The only reason you'd need that code is if you were trying to construct
> a fake Relation structure, which seems unnecessary and undesirable.
>

The planner requires IndexOptInfo, and for the planner to choose the
hypothetical index we need to fill in the fwdsortop, revsortop, opfamily and
opcintype, and this is the information that IndexAdvisor populates using
IndexSupportInitialize() (at least until c0b5fac7 changed the function
signature.

I am trying to populate an IndexOptInfo just like get_relation_info() does
after the 'info = makeNode(IndexOptInfo);' line.

What would be the best way to build an IndexOptInfo for a plain BTREE index
for different data types?

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 23:02:18
Message-ID: AANLkTikQCagn45DQwN8RN+SLKdS90UH7E2bW6++-YwPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 16, 2011 at 10:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> > BTW, you use the term 'fictitious' in the comments, would it be
> better
> > to standardize the term used for such an index? So either the comment
> would
> > be changed to call it hypothetical, or the structure member would be
> changed
> > to isfictitious.
>
> Yeah, hypothetical is the more-established term I think.
>

Please find the patch attached.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

Attachment Content-Type Size
fix_get_actual_variable_range.patch text/x-patch 1.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-16 23:37:05
Message-ID: 11049.1297899425@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> On Wed, Feb 16, 2011 at 10:25 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The only reason you'd need that code is if you were trying to construct
>> a fake Relation structure, which seems unnecessary and undesirable.

> The planner requires IndexOptInfo, and for the planner to choose the
> hypothetical index we need to fill in the fwdsortop, revsortop, opfamily and
> opcintype, and this is the information that IndexAdvisor populates using
> IndexSupportInitialize() (at least until c0b5fac7 changed the function
> signature.

Yeah, and the set of stuff you need in IndexOptInfo changed again last
week; see collations. Direct access to IndexSupportInitialize is even
less useful today than it was a week ago. This stuff has changed many
times before, and it will change again in the future, and exporting a
private function that has an unrelated purpose is not going to insulate
you from needing to deal with that.

> What would be the best way to build an IndexOptInfo for a plain BTREE index
> for different data types?

Fetch the values you need and stuff 'em in the struct. Don't expect
relcache to do it for you. The only reason relcache is involved in the
current workflow is that we try to cache the information across queries
in order to save on planner startup time ... but I don't think that that
concern is nearly as pressing for something like Index Advisor. You'll
have enough to do tracking changes in IndexOptInfo, you don't need to
have to deal with refactorings inside relcache as well.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-17 00:25:54
Message-ID: 17922.1297902354@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> On Wed, Feb 16, 2011 at 10:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yeah, hypothetical is the more-established term I think.

> Please find the patch attached.

Applied with minor adjustments to HEAD and 9.0.

regards, tom lane


From: Thom Brown <thom(at)linux(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-17 00:39:50
Message-ID: AANLkTiktOSHDM3iQrt5kv3oYxfukx_d2emoMZky6yD=-@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16 February 2011 23:02, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> wrote:
> On Wed, Feb 16, 2011 at 10:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
>> >     BTW, you use the term 'fictitious' in the comments, would it be
>> > better
>> > to standardize the term used for such an index? So either the comment
>> > would
>> > be changed to call it hypothetical, or the structure member would be
>> > changed
>> > to isfictitious.
>>
>> Yeah, hypothetical is the more-established term I think.
>
> Please find the patch attached.

For my benefit, could you explain how ishypothetical gets set to true?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-17 00:48:55
Message-ID: 18465.1297903735@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thom Brown <thom(at)linux(dot)com> writes:
> For my benefit, could you explain how ishypothetical gets set to true?

In the core, it never does. An index advisor plugin would set it in
IndexOptInfo structs that it makes.

regards, tom lane


From: Thom Brown <thom(at)linux(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-17 00:53:19
Message-ID: AANLkTi=DqPXZEWWxFYvOPzvEJNnGgUES_cLzoTGioBPR@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17 February 2011 00:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thom Brown <thom(at)linux(dot)com> writes:
>> For my benefit, could you explain how ishypothetical gets set to true?
>
> In the core, it never does.  An index advisor plugin would set it in
> IndexOptInfo structs that it makes.

I get the idea. Thanks Tom.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-17 01:25:55
Message-ID: AANLkTimaHKE4=8t7fEzszyTzLTrXUy3dguq2_+PQpJFo@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 16, 2011 at 7:25 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> > On Wed, Feb 16, 2011 at 10:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Yeah, hypothetical is the more-established term I think.
>
> > Please find the patch attached.
>
> Applied with minor adjustments to HEAD and 9.0.
>

Thanks Tom.
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-17 12:30:05
Message-ID: AANLkTikn1uTQrQZcJ_kE=ZT_Lctp4UL=kO1h6c2XJL4O@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com> writes:
> > On Wed, Feb 16, 2011 at 10:25 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> The only reason you'd need that code is if you were trying to construct
> >> a fake Relation structure, which seems unnecessary and undesirable.
>
> > The planner requires IndexOptInfo, and for the planner to choose the
> > hypothetical index we need to fill in the fwdsortop, revsortop, opfamily
> and
> > opcintype, and this is the information that IndexAdvisor populates using
> > IndexSupportInitialize() (at least until c0b5fac7 changed the function
> > signature.
>
> Yeah, and the set of stuff you need in IndexOptInfo changed again last
> week; see collations. Direct access to IndexSupportInitialize is even
> less useful today than it was a week ago. This stuff has changed many
> times before, and it will change again in the future, and exporting a
> private function that has an unrelated purpose is not going to insulate
> you from needing to deal with that.
>

I guess you are right.

>
> > What would be the best way to build an IndexOptInfo for a plain BTREE
> index
> > for different data types?
>
> Fetch the values you need and stuff 'em in the struct. Don't expect
> relcache to do it for you. The only reason relcache is involved in the
> current workflow is that we try to cache the information across queries
> in order to save on planner startup time ... but I don't think that that
> concern is nearly as pressing for something like Index Advisor. You'll
> have enough to do tracking changes in IndexOptInfo, you don't need to
> have to deal with refactorings inside relcache as well.
>

I also wish to make Index Advisor faster by not having to lookup this info
every time a new query comes in and that's why I was trying to use the guts
of IndexSupportInitialize() where it does the caching. If Index Advisor went
on its own then we'll be implementing caching of opfamily and opcintype etc
in the contrib/ code. And I am pretty sure we can't do it any better than
what Postgres is currently doing in terms of managing that cache and
possibly invalidating it when some relevant DDL happens.

Would it be possible to somehow expose that cache managed by
LookupOpclassInfo()? I see the comments above it note that it does not
handle invalidation since there's no need for it because currently one
cannot ALTER an opclass. But I do not wish to be revisiting this problem
when that changes. IOW, when ALTER for opclass is implemented,
LookupOpclassInfo would be changed to handle invalidation and I wish to
leverage that; it might mean change in function signature, but I guess Index
Advisor will have to live with that.

Thanks,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-18 07:27:54
Message-ID: 4D5E1F7A.4070903@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17.02.2011 14:30, Gurjeet Singh wrote:
> On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Gurjeet Singh<singh(dot)gurjeet(at)gmail(dot)com> writes:
>>> On Wed, Feb 16, 2011 at 10:25 AM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> The only reason you'd need that code is if you were trying to construct
>>>> a fake Relation structure, which seems unnecessary and undesirable.
>>
>>> The planner requires IndexOptInfo, and for the planner to choose the
>>> hypothetical index we need to fill in the fwdsortop, revsortop, opfamily
>> and
>>> opcintype, and this is the information that IndexAdvisor populates using
>>> IndexSupportInitialize() (at least until c0b5fac7 changed the function
>>> signature.
>>
>> Yeah, and the set of stuff you need in IndexOptInfo changed again last
>> week; see collations. Direct access to IndexSupportInitialize is even
>> less useful today than it was a week ago. This stuff has changed many
>> times before, and it will change again in the future, and exporting a
>> private function that has an unrelated purpose is not going to insulate
>> you from needing to deal with that.
>>
>
> I guess you are right.
>
>
>>
>>> What would be the best way to build an IndexOptInfo for a plain BTREE
>> index
>>> for different data types?
>>
>> Fetch the values you need and stuff 'em in the struct. Don't expect
>> relcache to do it for you. The only reason relcache is involved in the
>> current workflow is that we try to cache the information across queries
>> in order to save on planner startup time ... but I don't think that that
>> concern is nearly as pressing for something like Index Advisor. You'll
>> have enough to do tracking changes in IndexOptInfo, you don't need to
>> have to deal with refactorings inside relcache as well.
>>
>
> I also wish to make Index Advisor faster by not having to lookup this info
> every time a new query comes in and that's why I was trying to use the guts
> of IndexSupportInitialize() where it does the caching.

I doubt performance matters much here. It's not like you're going to be
explaining hundreds of queries per second with a hypotethical index
installed. I think of this as a manual tool that you run from a GUI when
you wonder if an index on column X would help.

The question is, can the Index Advisor easilt access all the information
it needs to build the IndexOptInfo? If not, what's missing?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-18 15:02:38
Message-ID: AANLkTim6GLvYcYQ1bVqPLRNGMYkWKB9c7DioQJMn3WRP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 18, 2011 at 2:27 AM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> On 17.02.2011 14:30, Gurjeet Singh wrote:
>
>> On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Fetch the values you need and stuff 'em in the struct. Don't expect
>>> relcache to do it for you. The only reason relcache is involved in the
>>> current workflow is that we try to cache the information across queries
>>> in order to save on planner startup time ... but I don't think that that
>>> concern is nearly as pressing for something like Index Advisor. You'll
>>> have enough to do tracking changes in IndexOptInfo, you don't need to
>>> have to deal with refactorings inside relcache as well.
>>>
>>>
>> I also wish to make Index Advisor faster by not having to lookup this info
>> every time a new query comes in and that's why I was trying to use the
>> guts
>> of IndexSupportInitialize() where it does the caching.
>>
>
> I doubt performance matters much here. It's not like you're going to be
> explaining hundreds of queries per second with a hypotethical index
> installed. I think of this as a manual tool that you run from a GUI when you
> wonder if an index on column X would help.
>
> The question is, can the Index Advisor easilt access all the information it
> needs to build the IndexOptInfo? If not, what's missing?

There's a command line tool in the Index Adviser contrib that takes a file
full of SQL and run them against the Index Adviser. People would want that
tool to be as fast as it can be.

Another use case of the Index Advisor is to be switched on for a few hours
while the application runs, and gather the recommendations for the whole
run. We'll need good performance that case too.

I'll try to figure out what all info we need for IndexOptInfo, it'll take
some time though.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-18 18:36:29
Message-ID: 4D5EBC2D.5080205@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18.02.2011 17:02, Gurjeet Singh wrote:
> On Fri, Feb 18, 2011 at 2:27 AM, Heikki Linnakangas<
> heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>
>> On 17.02.2011 14:30, Gurjeet Singh wrote:
>>
>>> On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>
>>> Fetch the values you need and stuff 'em in the struct. Don't expect
>>>> relcache to do it for you. The only reason relcache is involved in the
>>>> current workflow is that we try to cache the information across queries
>>>> in order to save on planner startup time ... but I don't think that that
>>>> concern is nearly as pressing for something like Index Advisor. You'll
>>>> have enough to do tracking changes in IndexOptInfo, you don't need to
>>>> have to deal with refactorings inside relcache as well.
>>>>
>>>>
>>> I also wish to make Index Advisor faster by not having to lookup this info
>>> every time a new query comes in and that's why I was trying to use the
>>> guts
>>> of IndexSupportInitialize() where it does the caching.
>>>
>>
>> I doubt performance matters much here. It's not like you're going to be
>> explaining hundreds of queries per second with a hypotethical index
>> installed. I think of this as a manual tool that you run from a GUI when you
>> wonder if an index on column X would help.
>>
>> The question is, can the Index Advisor easilt access all the information it
>> needs to build the IndexOptInfo? If not, what's missing?
>
>
> There's a command line tool in the Index Adviser contrib that takes a file
> full of SQL and run them against the Index Adviser. People would want that
> tool to be as fast as it can be.

Nah, I don't buy that, sounds like a premature optimization. Just
planning a bunch of SQL statements, even if there's hundreds of them in
the file, shouldn't take that long. And even if it does, I don't believe
without evidence that the catalog lookups for the hypothetical indexes
is where the time is spent.

> Another use case of the Index Advisor is to be switched on for a few hours
> while the application runs, and gather the recommendations for the whole
> run. We'll need good performance that case too.

How exactly does that work? I would imagine that you log all the
different SQL statements and how often they're run during that period.
Similar to pgFouine, for example. And only then you run the index
advisor on the collected SQL statements.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-18 18:49:32
Message-ID: 12191.1298054972@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> On 18.02.2011 17:02, Gurjeet Singh wrote:
>>> On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Fetch the values you need and stuff 'em in the struct. Don't expect
>>>> relcache to do it for you.

>>> I also wish to make Index Advisor faster by not having to lookup this info
>>> every time a new query comes in and that's why I was trying to use the
>>> guts of IndexSupportInitialize() where it does the caching.

> Nah, I don't buy that, sounds like a premature optimization. Just
> planning a bunch of SQL statements, even if there's hundreds of them in
> the file, shouldn't take that long. And even if it does, I don't believe
> without evidence that the catalog lookups for the hypothetical indexes
> is where the time is spent.

But even more to the point, IndexSupportInitialize is simply not well
matched to the problem. It's designed to help produce a relcache entry
from a pg_index entry, and in particular to look up opfamily and input
datatype from an opclass OID. (Oh, and to produce info about index
support functions, which you certainly don't need.) But as far as I can
see, an index advisor would already *have* opfamily and input datatype,
because what it's going to be starting from is some query WHERE
condition that it thinks would be worth optimizing. What it's going to
get from looking up that operator in pg_amop is opfamily and opcintype
information. Looking up an opclass from that is unnecessary work as far
as I can see (you don't need it to put in IndexOptInfo, for sure), and
reversing the lookup afterwards is certainly pointless.

So even granted that performance is critical, you haven't made a case
why exposing IndexSupportInitialize is going to be useful.

regards, tom lane


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for Index Advisor related hooks
Date: 2011-02-23 17:49:28
Message-ID: AANLkTikFoog3ygvfy2+QEEg35Ue5mDDyO3uKJLfO4SFq@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 18, 2011 at 1:36 PM, Heikki Linnakangas <
heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> On 18.02.2011 17:02, Gurjeet Singh wrote:
>
> Another use case of the Index Advisor is to be switched on for a few hours
>> while the application runs, and gather the recommendations for the whole
>> run. We'll need good performance that case too.
>>
>
> How exactly does that work? I would imagine that you log all the different
> SQL statements and how often they're run during that period. Similar to
> pgFouine, for example. And only then you run the index advisor on the
> collected SQL statements.

The Index Advisor produces recommendations for every running query on the
fly and stores them in a table. After the application run is over, these
recommendations can be viewed in the table and analyzed to pick the indexes
that provide the most benefit.

Regards,
--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh(dot)gurjeet(at){ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device