Re: I: About "Our CLUSTER implementation is pessimal" patch

From:	Leonardo Francalanci <m_lists(at)yahoo(dot)it>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: I: About "Our CLUSTER implementation is pessimal" patch
Date:	2010-10-04 20:47:30
Message-ID:	863800.83599.qm@web29017.mail.ird.yahoo.com
Lists:	pgsql-hackers

> It sounds like the costing model might need a bit more work before we commit
> this.

I re-ran the simple SQL tests I posted a while ago, and I still get the
same ratios.
I tested with the patch applied on a dual Opteron + disk array Solaris machine.

I really don't get how a laptop hard drive can be faster at reading data with
random seeks (required by the original cluster method) than with seq scan + sort
for the 5M rows test case.
Same thing for the "cluster vs bloat" test: the seq scan + sort is faster on my
machine.
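
To make the comparison above concrete, here is a rough back-of-the-envelope sketch of the I/O time of the two strategies. Everything in it is an assumption for illustration: the function names, the 8 ms seek time, the 50 MB/s sequential throughput and the 100-byte row width are not measurements from this thread, and this is not the planner's cost model.

```python
# Toy I/O estimate; every number and name here is a hypothetical assumption.

def index_scan_seconds(rows, seek_ms, cache_hit_ratio=0.0):
    """Worst case for the original method: one random seek per row fetched.

    cache_hit_ratio is a stand-in for anything that avoids a seek
    (rows already in cache, or a table already in index order).
    """
    return rows * (1.0 - cache_hit_ratio) * seek_ms / 1000.0


def seqscan_sort_seconds(rows, row_bytes, seq_mb_per_s, sort_passes=1):
    """Seq scan + sort: read the table once sequentially, then write and
    re-read the data once per external sort pass (0 passes = in-memory sort).
    """
    table_mb = rows * row_bytes / 1e6
    total_mb = table_mb * (1 + 2 * sort_passes)
    return total_mb / seq_mb_per_s


if __name__ == "__main__":
    rows = 5_000_000
    print("index scan   :", index_scan_seconds(rows, seek_ms=8.0), "s")
    print("seqscan+sort :", seqscan_sort_seconds(rows, 100, 50.0), "s")
```

Under these assumptions the index scan only wins when almost every fetch avoids a seek, which is why the cache_hit_ratio parameter is exposed.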

I've just noticed that Josh used shared_buffers = 16MB for the "cluster vs
bloat" test. I'm using a much higher shared_buffers (something like 200MB, I
think), since I thought that would be a more appropriate value when working
with tables this big.
Maybe that's what makes the difference?

Can someone else test the patch?

Also, I don't know in depth how PostgreSQL deletes rows, but I thought
that something like:

DELETE FROM mybloat WHERE RANDOM() < 0.9;

would only delete data, not index entries; so the patch should perform even
better in this case (as it does, in fact, on my test machine), as:

- the original cluster method would read the whole index, and fetch only the
  "still alive" rows
- the new method would read the table using a seq scan, and sort in memory the
  few rows still alive

But, as I said, maybe I'm getting this part wrong...
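
The two bullets above can be sketched as a small simulation. It follows the description in this message (index entries survive the DELETE, and only live rows are fetched), which may not match what the server really does; the function names, table size and seed are hypothetical.

```python
import random

# Toy model of the "cluster vs bloat" case as described in this message.
# All names and sizes are hypothetical; this is not PostgreSQL code.

def make_bloated_table(n_rows, delete_prob, seed=0):
    """Return (heap, alive): keys in physical order, and the set of keys
    that survive a DELETE ... WHERE RANDOM() < delete_prob."""
    rng = random.Random(seed)
    heap = list(range(n_rows))
    rng.shuffle(heap)  # physical order unrelated to key order
    alive = {key for key in heap if rng.random() >= delete_prob}
    return heap, alive


def cluster_via_index(heap, alive):
    """Original method: walk every index entry in key order and keep
    only the rows that are still alive."""
    visited = 0
    out = []
    for key in sorted(heap):  # assumed: the index still has all entries
        visited += 1
        if key in alive:
            out.append(key)
    return out, visited


def cluster_via_seqscan_sort(heap, alive):
    """New method: scan the heap in physical order, then sort the
    survivors in memory."""
    survivors = [key for key in heap if key in alive]
    return sorted(survivors), len(survivors)


if __name__ == "__main__":
    heap, alive = make_bloated_table(10_000, 0.9)
    _, visited = cluster_via_index(heap, alive)
    _, sorted_rows = cluster_via_seqscan_sort(heap, alive)
    print("index entries visited:", visited)
    print("rows sorted in memory:", sorted_rows)
```

Both functions produce the same ordered output; the difference is that the first walks one index entry per original row, while the second sorts only the survivors after a single sequential pass.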

In response to

Re: I: About "Our CLUSTER implementation is pessimal" patch at 2010-10-01 02:20:48 from Robert Haas

Responses

Re: I: About "Our CLUSTER implementation is pessimal" patch at 2010-10-05 02:21:42 from Josh Kupershmidt
