Commit fest status

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Commit fest status
Date: 2008-04-11 18:33:40
Message-ID: 14656.1207938820@sss.pgh.pa.us

What's left on Bruce's patch queue page is:

* Finishing out Heikki's patch to allow runtime determination of the
need to recheck an index condition. What's committed so far doesn't
yet have any actual use :-(. Although I intend to keep working on
that, it's clearly new development and hence not commit-fest material.

* Design discussions about dead space map, free space map, etc.
I think that we have pretty much converged on a consensus that the
way to store these maps is to add separate subsidiary file(s) for
each relation ("forks", for lack of a better name). And that really
seems to be the only thing we need to decide now --- there's not much
else to talk about until we have some prototype code to experiment
with.

* That thread about "real procedures". I'm not seeing that we need
any further discussion now about that, either. The consensus in the
thread seemed to be that having a PL that could execute "outside
transactions" would be good, but nobody was excited about much else
that was suggested.

In short, I think it's time to declare our first commit fest done.

regards, tom lane


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commit fest status
Date: 2008-04-11 19:15:04
Message-ID: 60prsww6dj.fsf@dba2.int.libertyrms.com

tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane) writes:
> In short, I think it's time to declare our first commit fest done.

Congratulations!

Speaking as a pure observer, it has clearly been a somewhat painful
process, though that must be weighed against the fact that what was
being reviewed was pretty much a year's worth of work. I think
there's reason to hope that later iterations should be a bit easier
from that perspective alone. And hopefully the "learning curve" means
that things have been learned to ease future pain :-).

Thanks to all who have been working on it!
--
let name="cbbrowne" and tld="linuxdatabases.info" in String.concat "@" [name;tld];;
http://linuxfinances.info/info/spiritual.html
Rules of the Evil Overlord #130. "All members of my Legions of Terror
will have professionally tailored uniforms. If the hero knocks a
soldier unconscious and steals the uniform, the poor fit will give him
away." <http://www.eviloverlord.com/>


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commit fest status
Date: 2008-04-11 19:18:16
Message-ID: 200804111918.m3BJIGk22171@momjian.us

Tom Lane wrote:
> What's left on Bruce's patch queue page is:
>
> * Finishing out Heikki's patch to allow runtime determination of the
> need to recheck an index condition. What's committed so far doesn't
> yet have any actual use :-(. Although I intend to keep working on
> that, it's clearly new development and hence not commit-fest material.
>
> * Design discussions about dead space map, free space map, etc.
> I think that we have pretty much converged on a consensus that the
> way to store these maps is to add separate subsidiary file(s) for
> each relation ("forks", for lack of a better name). And that really
> seems to be the only thing we need to decide now --- there's not much
> else to talk about until we have some prototype code to experiment
> with.
>
> * That thread about "real procedures". I'm not seeing that we need
> any further discussion now about that, either. The consensus in the
> thread seemed to be that having a PL that could execute "outside
> transactions" would be good, but nobody was excited about much else
> that was suggested.
>
> In short, I think it's time to declare our first commit fest done.

OK, todo updated, but what about the "Maintaining cluster order on
insert" idea?

http://momjian.us/cgi-bin/pgpatches

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commit fest status
Date: 2008-04-11 19:33:56
Message-ID: 15811.1207942436@sss.pgh.pa.us

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Tom Lane wrote:
>> In short, I think it's time to declare our first commit fest done.

> OK, todo updated, but what about the "Maintaining cluster order on
> insert" idea?
> http://momjian.us/cgi-bin/pgpatches

The last item I see in the thread is some performance tests that
make it look not worthwhile. There's no discussion needed, unless
someone refutes that test or improves the code.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commit fest status
Date: 2008-04-11 19:36:34
Message-ID: 200804111936.m3BJaYB01736@momjian.us

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Tom Lane wrote:
> >> In short, I think it's time to declare our first commit fest done.
>
> > OK, todo updated, but what about the "Maintaining cluster order on
> > insert" idea?
> > http://momjian.us/cgi-bin/pgpatches
>
> The last item I see in the thread is some performance tests that
> make it look not worthwhile. There's no discussion needed, unless
> someone refutes that test or improves the code.

OK, so we delete it --- fine.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Decibel! <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commit fest status
Date: 2008-04-15 17:28:55
Message-ID: 007B5B4F-DEFB-4743-A04A-F6C3CA6FEB13@decibel.org

On Apr 11, 2008, at 2:33 PM, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> Tom Lane wrote:
>>> In short, I think it's time to declare our first commit fest done.
>
>> OK, todo updated, but what about the "Maintaining cluster order on
>> insert" idea?
>> http://momjian.us/cgi-bin/pgpatches
>
> The last item I see in the thread is some performance tests that
> make it look not worthwhile. There's no discussion needed, unless
> someone refutes that test or improves the code.

What about Heikki's question to you about insert variations in your
test? See
http://archives.postgresql.org/pgsql-patches/2007-07/msg00123.php

Even looking at Heikki's test results I'm still questioning the
validity of the test itself. I don't see any notable difference in
performance in the SELECTs in Heikki's two tests, which makes me
think that the data was being cached somewhere. If my math is
correct, this test should be generating a table that's about 400MB,
so unless you were running this on a 486 or something, it's going to
be cached. I wouldn't expect this test to buy *anything* in the case
of all the data being cached. In fact, it's not going to help at all
if the pages we need to pull for the partial SELECTs are in memory,
which means that for this test to be useful you either need a very
large dataset, or you have to do something to flush the cache before
the SELECT test. If even 50% of the table fits in memory, you could
still very possibly find all the pages you needed already in memory,
which spoils things.

Another issue: I think we first need to establish how useful
clustering is (unless everyone already agrees that it's a very
useful tool that we need), and then consider the performance impact
of this patch on inserts and ways to reduce it.

Towards the former, I've run some tests on some non-spectacular
hardware. I created a table similar to Tom's and populated it via:

create table test (i int, d text);
insert into test SELECT 1000000*random(), repeat('x',350) FROM
generate_series(1,1000000);
create index test_i on test(i);
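
For a rough sense of how big this table is on disk, here is a
back-of-the-envelope sketch in Python. The layout constants (8 kB pages,
24-byte page header, 4-byte line pointer, 23-byte tuple header padded to
8-byte alignment, 4-byte varlena header) are assumptions about a default
PostgreSQL build, not measurements from this test, and the function
names are made up for illustration.

```python
import math

# Assumed layout constants for a default PostgreSQL build (not measured here).
PAGE_SIZE = 8192        # bytes per heap page
PAGE_HEADER = 24        # bytes of page header
LINE_POINTER = 4        # bytes per line pointer in the page's item array
TUPLE_HEADER = 23       # bytes of heap tuple header before padding
MAXALIGN = 8            # alignment of tuple data and of whole tuples


def align(n, boundary):
    """Round n up to the next multiple of boundary."""
    return (n + boundary - 1) // boundary * boundary


def estimate_heap_bytes(rows, text_len):
    """Estimate heap size for a table of (int, text) rows, ignoring free space."""
    offset = align(TUPLE_HEADER, MAXALIGN)     # user data starts after the header
    offset += 4                                # the int column
    offset = align(offset, 4) + 4 + text_len   # 4-byte varlena header + payload
    tuple_size = align(offset, MAXALIGN)
    per_page = (PAGE_SIZE - PAGE_HEADER) // (tuple_size + LINE_POINTER)
    pages = math.ceil(rows / per_page)
    return pages * PAGE_SIZE


size = estimate_heap_bytes(rows=1_000_000, text_len=350)
print(f"{size / 1e6:.0f} MB in the heap, before counting the index")
```

Under those assumptions the heap comes out a little under 400 MB. On a
live server, pg_relation_size('test') would give the real figure.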

I then ran test.sh:
bin/pg_ctl -D data stop
clearmem 625
bin/pg_ctl -D data start
sleep 15
bin/psql -f test.sql

test.sql:
set enable_bitmapscan to off;
explain analyze select * from test where i between 2000 and 3000;
explain analyze select * from test;

clearmem is something that just allocates a bunch of memory to clear
the cache. Unfortunately I wasn't able to completely clear the cache,
but it was enough to show the benefit of clustering.
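
clearmem itself is not shown in this thread. As a hypothetical stand-in,
the idea of allocating memory and touching every page, so that the OS
has to give up cached file pages, could be sketched in Python roughly
like this. The function name, the 4096-byte page size, and the reading
of "625" as megabytes are all assumptions, not the actual tool.

```python
def touch_memory(megabytes, page_size=4096):
    """Allocate `megabytes` MB and write to every page of it.

    Hypothetical stand-in for clearmem, not the real tool.
    Returns the number of pages written to.
    """
    buf = bytearray(megabytes * 1024 * 1024)
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1   # writing makes sure the page is really backed by memory
        touched += 1
    return touched

# To mirror "clearmem 625" in test.sh (assuming the argument is in MB):
#     touch_memory(625)
```

This only puts pressure on the OS cache, so it does not guarantee a
completely cold cache. It also does nothing about PostgreSQL's own
shared buffers, which is presumably why test.sh stops and restarts the
server around it.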

I ran that script several times with the table not clustered; the
results were in the 18-20 second range for the between query. For
grins I also tried with bitmapscan on, but results were inconclusive.
I then clustered the table and re-ran the test; response times were
sub-second. Granted, this is on pedestrian hardware, so a good SAN
might not show as big a difference. I can try testing this at work if
there's desire.

So clustering certainly offers a benefit. Is there some way we can
improve the patch to reduce the impact to INSERT?
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828