Re: fate of CLUSTER command ?

Lists: pgsql-hackers
From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: fate of CLUSTER command ?
Date: 2002-08-03 16:47:10
Message-ID: Pine.GSO.4.44.0208031943560.10680-100000@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I just tried CLUSTER command at fts.postgresql.org to cluster
fts index and got very visual performance win. Unfortunately
I had to restore permissions and recreate other indices by hand.
So, I'm interested what's a future of CLUSTER command ?

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-03 18:34:36
Message-ID: Pine.LNX.4.44.0208031432160.7705-100000@cm-lcon1-46-187.cm.vtr.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov dijo:

> I just tried CLUSTER command at fts.postgresql.org to cluster
> fts index and got very visual performance win. Unfortunately
> I had to restore permissions and recreate other indices by hand.
> So, I'm interested what's a future of CLUSTER command ?

I'm working on CLUSTER. I have a problem with dependency tracking right
now that I need to get fixed before the patch gets accepted, but that
shouldn't take long (hopefully).

The patch supposedly fixes all the concerns about CLUSTER (permissions,
other indexes, inheritance).

--
Alvaro Herrera (<alvherre[a]atentus.com>)
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Alvaro Herrera <alvherre(at)atentus(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-03 19:15:28
Message-ID: Pine.GSO.4.44.0208032214191.10680-100000@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 3 Aug 2002, Alvaro Herrera wrote:

> Oleg Bartunov dijo:
>
> > I just tried CLUSTER command at fts.postgresql.org to cluster
> > fts index and got very visual performance win. Unfortunately
> > I had to restore permissions and recreate other indices by hand.
> > So, I'm interested what's a future of CLUSTER command ?
>
> I'm working on CLUSTER. I have a problem with dependency tracking right
> now that I need to get fixed before the patch gets accepted, but that
> shouldn't take long (hopefully).
>
> The patch supposedly fixes all the concerns about CLUSTER (permissions,
> other indexes, inheritance).
>

God news. Will it go to 7.3 ?

>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Alvaro Herrera <alvherre(at)atentus(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-03 20:25:32
Message-ID: Pine.LNX.4.44.0208031614030.13128-100000@cm-lcon1-46-187.cm.vtr.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov dijo:

> On Sat, 3 Aug 2002, Alvaro Herrera wrote:
>
> > Oleg Bartunov dijo:
> >
> > > I just tried CLUSTER command at fts.postgresql.org to cluster
> > > fts index and got very visual performance win. Unfortunately
> > > I had to restore permissions and recreate other indices by hand.
> > > So, I'm interested what's a future of CLUSTER command ?
> >
> > I'm working on CLUSTER. I have a problem with dependency tracking right
> > now that I need to get fixed before the patch gets accepted, but that
> > shouldn't take long (hopefully).
>
> God news. Will it go to 7.3 ?

In fact, I have just corrected the error and am submitting the patch for
revision and possible inclusion.

Please test it and check if it does what you need. Let me know if it
doesn't, because it should.

--
Alvaro Herrera (<alvherre[a]atentus.com>)
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 00:57:43
Message-ID: 200208040057.g740vhB14596@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov wrote:
> I just tried CLUSTER command at fts.postgresql.org to cluster
> fts index and got very visual performance win. Unfortunately
> I had to restore permissions and recreate other indices by hand.
> So, I'm interested what's a future of CLUSTER command ?

Yes, I have always liked CLUSTER with full text searches because you are
usually hitting multiple rows with a single equaltiy restriction, and
CLUSTER puts all the hits on the same page.

If you look in contrib/fulltextindex, you will see mention of CLUSTER in
the README. It may make sense to add that to your documentation.

Also, is there any value to contrib/fulltextindex now that we have
contrib/tsearch?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 01:02:57
Message-ID: 200208040102.g7412vk15283@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Also, let me add that CLUSTER in 7.3 will be fully functional because we
will no longer be changing the oid of the table during cluster. This
will allow people to use CLUSTER more frequently/safely.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Oleg Bartunov wrote:
> > I just tried CLUSTER command at fts.postgresql.org to cluster
> > fts index and got very visual performance win. Unfortunately
> > I had to restore permissions and recreate other indices by hand.
> > So, I'm interested what's a future of CLUSTER command ?
>
> Yes, I have always liked CLUSTER with full text searches because you are
> usually hitting multiple rows with a single equaltiy restriction, and
> CLUSTER puts all the hits on the same page.
>
> If you look in contrib/fulltextindex, you will see mention of CLUSTER in
> the README. It may make sense to add that to your documentation.
>
> Also, is there any value to contrib/fulltextindex now that we have
> contrib/tsearch?
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 05:34:26
Message-ID: 013501c23b78$99d89ee0$0200a8c0@SOL
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Yes, I have always liked CLUSTER with full text searches because you are
> usually hitting multiple rows with a single equaltiy restriction, and
> CLUSTER puts all the hits on the same page.
>
> If you look in contrib/fulltextindex, you will see mention of CLUSTER in
> the README. It may make sense to add that to your documentation.
>
> Also, is there any value to contrib/fulltextindex now that we have
> contrib/tsearch?

I haven't looked at tsearch yet, but I expect it's way better than
fulltextindex. However there's more than a few of us using fulltextindex,
so I think it will need to stay for some while. I'm working on a new
version of it for 7.3.

I can put pointers in the README about checking out tsearch...

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 05:41:29
Message-ID: 7704.1028439689@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au> writes:
>> Also, is there any value to contrib/fulltextindex now that we have
>> contrib/tsearch?

> I haven't looked at tsearch yet, but I expect it's way better than
> fulltextindex. However there's more than a few of us using fulltextindex,
> so I think it will need to stay for some while.

Right, at least a couple releases.

> I'm working on a new version of it for 7.3.

What have you got in mind?

regards, tom lane


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 07:21:39
Message-ID: Pine.GSO.4.44.0208041006500.15807-100000@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 3 Aug 2002, Bruce Momjian wrote:

> Oleg Bartunov wrote:
> > I just tried CLUSTER command at fts.postgresql.org to cluster
> > fts index and got very visual performance win. Unfortunately
> > I had to restore permissions and recreate other indices by hand.
> > So, I'm interested what's a future of CLUSTER command ?
>
> Yes, I have always liked CLUSTER with full text searches because you are
> usually hitting multiple rows with a single equaltiy restriction, and
> CLUSTER puts all the hits on the same page.
>
> If you look in contrib/fulltextindex, you will see mention of CLUSTER in
> the README. It may make sense to add that to your documentation.
>

I have to play to get feeling. I don't understand what happens if
rows will be added to clustered table. Also, what will happens if
there are several other indices on the same table ? Does clustering
on one index will decrease performance of queries based on another
indices ?

> Also, is there any value to contrib/fulltextindex now that we have
> contrib/tsearch?
>

they 're different things.

>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 07:42:33
Message-ID: Pine.GSO.4.44.0208041033390.15807-100000@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 4 Aug 2002, Christopher Kings-Lynne wrote:

> > Yes, I have always liked CLUSTER with full text searches because you are
> > usually hitting multiple rows with a single equaltiy restriction, and
> > CLUSTER puts all the hits on the same page.
> >
> > If you look in contrib/fulltextindex, you will see mention of CLUSTER in
> > the README. It may make sense to add that to your documentation.
> >
> > Also, is there any value to contrib/fulltextindex now that we have
> > contrib/tsearch?
>
> I haven't looked at tsearch yet, but I expect it's way better than
> fulltextindex. However there's more than a few of us using fulltextindex,
> so I think it will need to stay for some while. I'm working on a new
> version of it for 7.3.
>

I'm totally agre with Chris. FTI is something another thing.
FTI is good for more or less static document collection - a cost of
insert if high for inverted indices. We've developed tsearch keeping in
mind incremental update.

FTI should be faster for short queries while tsearch is better for long one.

tsearch development focused also on real IR support - language support,
indexing of specified classes of lexemes , etc. We laready have OpenFTS
which has these features, but we want to move all functionality to
tsearch.

> I can put pointers in the README about checking out tsearch...
>
> Chris
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 07:52:45
Message-ID: 001601c23b8b$edde58a0$0200a8c0@SOL
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > I'm working on a new version of it for 7.3.
>
> What have you got in mind?

Well I have patches from Florian and someone else. Some wide character
stuff, non-indexable word support, full word match search support, speed and
space optimisations, etc.

I'm just trying to set it up in a backwards-compatible way... I want the
contrib to build two separate .so files...

Chris


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-04 14:10:07
Message-ID: 200208041410.g74EA7c01603@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov wrote:
> On Sat, 3 Aug 2002, Bruce Momjian wrote:
>
> > Oleg Bartunov wrote:
> > > I just tried CLUSTER command at fts.postgresql.org to cluster
> > > fts index and got very visual performance win. Unfortunately
> > > I had to restore permissions and recreate other indices by hand.
> > > So, I'm interested what's a future of CLUSTER command ?
> >
> > Yes, I have always liked CLUSTER with full text searches because you are
> > usually hitting multiple rows with a single equaltiy restriction, and
> > CLUSTER puts all the hits on the same page.
> >
> > If you look in contrib/fulltextindex, you will see mention of CLUSTER in
> > the README. It may make sense to add that to your documentation.
> >
>
> I have to play to get feeling. I don't understand what happens if
> rows will be added to clustered table. Also, what will happens if
> there are several other indices on the same table ? Does clustering
> on one index will decrease performance of queries based on another
> indices ?

Clustering on one index doesn't decrease the performance of the other
indexes. Also, only >=7.3 will preserve all indexes during cluster.
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-05 03:10:31
Message-ID: GNELIHDDFBOCMGBFGEFOOEHPCDAA.chriskl@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Clustering on one index doesn't decrease the performance of the other
> indexes. Also, only >=7.3 will preserve all indexes during cluster.

Sure it must? Since you are rearranging all on-disk rows to match a
particular index (say user_id, username) then it will slow down other
indexes (eg one just on username).

Chris


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-05 03:17:03
Message-ID: 200208050317.g753H3f02537@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Christopher Kings-Lynne wrote:
> > Clustering on one index doesn't decrease the performance of the other
> > indexes. Also, only >=7.3 will preserve all indexes during cluster.
>
> Sure it must? Since you are rearranging all on-disk rows to match a
> particular index (say user_id, username) then it will slow down other
> indexes (eg one just on username).

It will slow down other index scans only if there was some clustering on
those indexes before you ran the CLUSTER command.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fate of CLUSTER command ?
Date: 2002-08-07 05:24:53
Message-ID: 20020807052453.GB5933@rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Aug 04, 2002 at 11:17:03PM -0400, Bruce Momjian wrote:
> Christopher Kings-Lynne wrote:
> > > Clustering on one index doesn't decrease the performance of the other
> > > indexes. Also, only >=7.3 will preserve all indexes during cluster.
> >
> > Sure it must? Since you are rearranging all on-disk rows to match a
> > particular index (say user_id, username) then it will slow down other
> > indexes (eg one just on username).
>
> It will slow down other index scans only if there was some clustering on
> those indexes before you ran the CLUSTER command.

Actually, it would depend on the level of correlation between the values
indexed. If there's some correlation, performance using the second index
could improve some - if they're anti-correlated, it will decrease. If
uncorrelated, there should be no effect.

Ross