LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris

From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: pgsql-hackers(at)postgresql(dot)org
Subject: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-13 16:01:21
Message-ID: 4739CA51.8060406@sun.com
Lists: pgsql-hackers

Hello,

I am running tests with PG8.3b2 on Solaris 10 8/07 and I still see an I/O
flood when a checkpoint happens.
I have tried increasing bgwriter_lru_multiplier from its default of 2 to 5,
but I don't see any more writes by the bgwriter than in my previous test,
which used the default.

Then I tried raising checkpoint_completion_target to 0.9, but there is
still no spread of I/O (checkpoint_timeout is at the default of 5min).
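
For reference, the relevant bits of my postgresql.conf are roughly as
follows (everything else is at its default):

bgwriter_lru_multiplier      = 5      # default is 2
checkpoint_completion_target = 0.9    # default is 0.5
checkpoint_timeout           = 5min   # the default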

What am I missing?

How does PostgreSQL determine the Load distribution?

Regards,
Jignesh


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-13 16:32:01
Message-ID: 4739D181.7010208@enterprisedb.com
Lists: pgsql-hackers

Jignesh K. Shah wrote:
> I am running tests with PG8.3b2 on Solaris 10 8/07 and I still see an I/O
> flood when a checkpoint happens.
> I have tried increasing bgwriter_lru_multiplier from its default of 2 to 5,
> but I don't see any more writes by the bgwriter than in my previous test,
> which used the default.
>
> Then I tried raising checkpoint_completion_target to 0.9, but there is
> still no spread of I/O (checkpoint_timeout is at the default of 5min).
>
> What am I missing?

Two things spring to mind:

Even though the write()s are distributed, the fsyncs are still going to
cause a spike. But it should be much less severe than before. How bad
are the spikes you're seeing? Compared to 8.2?

Are your checkpoints triggered by checkpoint_timeout or by
checkpoint_segments? The calculation of "how much time do I have to
finish this checkpoint before the next one is triggered" takes both into
account, but the estimate based on checkpoint_segments is not very
accurate because of full page writes: we write a lot more WAL right
after a checkpoint, which makes the load distribution algorithm think
it's going to run out of checkpoint_segments much sooner than it
actually does. Raising checkpoint_segments will help with that.
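
To illustrate the pacing decision, here is a simplified sketch in C
(illustrative only, with made-up names; not the actual backend code):

#include <stdbool.h>

/*
 * Sketch of the "am I on schedule?" test.  progress is the fraction of
 * checkpoint writes already done; the elapsed_* arguments are the
 * fractions of checkpoint_timeout and checkpoint_segments consumed
 * since the checkpoint started.
 */
bool
checkpoint_on_schedule(double progress,
                       double elapsed_time_frac,
                       double elapsed_xlog_frac,
                       double completion_target)
{
    /* Aim to finish at completion_target of the interval, not at 100%. */
    progress *= completion_target;

    /*
     * Full page writes inflate elapsed_xlog_frac right after a
     * checkpoint, so this test can trip early and force full speed.
     */
    if (progress < elapsed_xlog_frac)
        return false;       /* behind on WAL consumption: hurry up */
    if (progress < elapsed_time_frac)
        return false;       /* behind on time: hurry up */

    return true;            /* ahead of schedule: ok to sleep a bit */
}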

BTW, please turn on log_checkpoints.

> How does PostgreSQL determine the Load distribution?

First, when the checkpoint starts, the shared buffer pool is scanned
and the dirty buffers are counted. Then the bgwriter starts to write
the buffers, constantly estimating how much work it has left and how
much time it has until the next checkpoint. The write rate is throttled
so that the remaining writes are distributed evenly across the time
remaining (checkpoint_completion_target is a fuzz factor applied to the
estimate of how much time is remaining).
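
In outline, the write phase is a loop like this (again a simplified
sketch; the helper names are invented):

#include <stdbool.h>

/* invented helpers (prototypes only, to keep the sketch self-contained) */
void   write_next_dirty_buffer(void);
void   nap_briefly(void);
double elapsed_time_fraction(void);
double elapsed_xlog_fraction(void);
bool   checkpoint_on_schedule(double, double, double, double);  /* above */

void
distribute_checkpoint_writes(int num_to_write, double completion_target)
{
    int written;

    for (written = 0; written < num_to_write; written++)
    {
        write_next_dirty_buffer();

        /* Re-estimate progress after every write, and nap if ahead. */
        if (checkpoint_on_schedule((double) (written + 1) / num_to_write,
                                   elapsed_time_fraction(),
                                   elapsed_xlog_fraction(),
                                   completion_target))
            nap_briefly();
    }
}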

Hope this helps..

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-13 16:37:29
Message-ID: 17597.1194971849@sss.pgh.pa.us
Lists: pgsql-hackers

"Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
> I am running tests with PG8.3b2 on Solaris 10 8/07 and I still see an I/O
> flood when a checkpoint happens.

I am thinking that you are probably trying to test that by issuing
manual CHECKPOINT commands. A manual checkpoint is still done at full
speed, as are shutdown checkpoints. You need to study the behavior of
automatic (background) checkpoints, instead.

regards, tom lane


From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-13 16:57:00
Message-ID: 4739D75C.1050407@sun.com
Lists: pgsql-hackers

I am running the EAStress workload, which doesn't do manual checkpoints
as far as I know.

I will turn on log_checkpoints to get a better picture, as Heikki
suggested.

Thanks.

-Jignesh

Tom Lane wrote:
> "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
>
>> I am running tests with PG8.3b2 on Solaris 10 8/07 and I still see an I/O
>> flood when a checkpoint happens.
>>
>
> I am thinking that you are probably trying to test that by issuing
> manual CHECKPOINT commands. A manual checkpoint is still done at full
> speed, as are shutdown checkpoints. You need to study the behavior of
> automatic (background) checkpoints, instead.
>
> regards, tom lane
>


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-13 19:19:25
Message-ID: Pine.GSO.4.64.0711131414230.15021@westnet.com
Lists: pgsql-hackers

On Tue, 13 Nov 2007, Jignesh K. Shah wrote:

> I have tried increasing bgwriter_lru_multiplier from its default of 2 to 5,
> but I don't see any more writes by the bgwriter than in my previous test,
> which used the default.

The multiplier only impacts writes being done by the LRU eviction
mechanism; it doesn't change anything related to the checkpoint writes.
In fact, since those LRU writes are going on at the same time as the
checkpoint ones, increasing the multiplier too much can make the
checkpoint I/O spike worse. It's unlikely that higher values will
decrease the spike.
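
To sketch the mechanism (illustrative pseudo-C, not the actual bgwriter
source): each round, the bgwriter aims to have roughly multiplier *
recent-allocations reusable buffers clean, capped by
bgwriter_lru_maxpages. Note that nothing in it looks at checkpoint
progress.

/*
 * Rough sketch of the LRU write budget per bgwriter round (names are
 * illustrative).  recent_alloc is a smoothed estimate of how many
 * buffers backends have been allocating per round.
 */
int
lru_write_budget(double recent_alloc,
                 double lru_multiplier,   /* bgwriter_lru_multiplier */
                 int    lru_maxpages,     /* bgwriter_lru_maxpages */
                 int    clean_ahead)      /* reusable buffers already clean */
{
    int target = (int) (recent_alloc * lru_multiplier) - clean_ahead;

    if (target < 0)
        target = 0;              /* enough clean buffers already */
    if (target > lru_maxpages)
        target = lru_maxpages;   /* hard cap per round */
    return target;
}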

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-14 08:55:47
Message-ID: 20071114174139.30F3.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Lists: pgsql-hackers


"Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> wrote:

> I am running tests with PG8.3b2 on Solaris 10 8/07 and I still see an I/O
> flood when a checkpoint happens.

Are there any I/O tuning knobs in Solaris? LDC in 8.3 expects the
kernel's background writing activity to be strong enough to keep the
number of dirty pages in the kernel small; i.e., fsync should be cheap.

Larger shared_buffers, checkpoint_timeout, checkpoint_segments and
checkpoint_completion_target make checkpoints smoother, but there is a
trade-off between smoothness and recovery time.

Stronger writer processes in the kernel also make checkpoints smoother,
but there is a trade-off between smoothness and throughput.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-14 23:24:26
Message-ID: 18062.1195082666@sss.pgh.pa.us
Lists: pgsql-hackers

"Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
> I will turn on log_checkpoints to get a better picture, as Heikki suggested

Did you find out anything?

Did this happen on every checkpoint, or only some of them? The bug
Itagaki-san pointed out today in IsCheckpointOnSchedule might account
for some checkpoints being done at full speed, but not all ...

regards, tom lane


From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-14 23:33:43
Message-ID: 473B85D7.5060902@sun.com
Lists: pgsql-hackers


I was waiting to digest what I saw before sending it to the group.

I am running the EAStress workload.

I am using open_datasync (O_DSYNC), which should sync the WAL as soon
as it is written.

With checkpoint_completion_target=0.9 and checkpoint_timeout=5min, it
seems to be doing the right thing, judging from the logfile output:

2007-11-13 09:20:49.070 PST 9180 LOG: checkpoint starting: time
2007-11-13 09:21:13.808 PST 9458 LOG: automatic analyze of table
"specdb.public.o_orderline" system usage: CPU 0.03s/0.50u sec elapsed
7.79 sec
2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
"specdb.public.txn_log_table": index scans: 1
pages: 11 removed, 105 remain
tuples: 3147 removed, 40 remain
system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec
2007-11-13 09:22:12.112 PST 9462 LOG: automatic vacuum of table
"specdb.public.txn_log_table": index scans: 1
pages: 28 removed, 77 remain
tuples: 1990 removed, 95 remain
system usage: CPU 0.11s/0.09u sec elapsed 5.98 sec
2007-11-13 09:23:12.121 PST 9466 LOG: automatic vacuum of table
"specdb.public.txn_log_table": index scans: 1
pages: 0 removed, 77 remain
tuples: 3178 removed, 128 remain
system usage: CPU 0.11s/0.04u sec elapsed 5.87 sec
2007-11-13 09:24:12.220 PST 9470 LOG: automatic vacuum of table
"specdb.public.txn_log_table": index scans: 1
pages: 0 removed, 77 remain
tuples: 3394 removed, 57 remain
system usage: CPU 0.11s/0.04u sec elapsed 5.85 sec
2007-11-13 09:25:12.400 PST 9474 LOG: automatic vacuum of table
"specdb.public.txn_log_table": index scans: 1
pages: 0 removed, 77 remain
tuples: 3137 removed, 1 remain
system usage: CPU 0.11s/0.04u sec elapsed 5.93 sec
2007-11-13 09:25:18.723 PST 9180 LOG: checkpoint complete: wrote 33362
buffers (2.2%); 0 transaction log file(s) added, 0 removed, 0 recycled;
write=269.642 s, sync=0.003 s, total=269.653 s
2007-11-13 09:25:49.000 PST 9180 LOG: checkpoint starting: time

However, the actual iostat output still shows a non-uniform
distribution, but I haven't put exact timestamps on the iostat output to
correlate it with the logfile entries. Maybe I should do that.

So from the PostgreSQL side, things look fine based on these outputs; I
need to figure out the Solaris side of it now.

Could it be related to autovacuum happening also?

Regards,
Jignesh

Tom Lane wrote:
> "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
>
>> I will turn on log_checkpoints to get a better picture, as Heikki suggested
>>
>
> Did you find out anything?
>
> Did this happen on every checkpoint, or only some of them? The bug
> Itagaki-san pointed out today in IsCheckpointOnSchedule might account
> for some checkpoints being done at full speed, but not all ...
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 00:17:05
Message-ID: 18797.1195085825@sss.pgh.pa.us
Lists: pgsql-hackers

"Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
> So from the PostgreSQL side, things look fine based on these outputs; I
> need to figure out the Solaris side of it now.

> Could it be related to autovacuum happening also?

Maybe ... have you tried fiddling with the vacuum_cost_delay options?
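
That is, something along these lines in postgresql.conf (the values
here are only illustrative):

vacuum_cost_delay            = 10ms   # 0 disables cost-based vacuum delay
autovacuum_vacuum_cost_delay = 20ms   # -1 means use vacuum_cost_delay
autovacuum_vacuum_cost_limit = 200    # -1 means use vacuum_cost_limit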

Looking at the autovacuum log output,

> 2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
> "specdb.public.txn_log_table": index scans: 1
> pages: 11 removed, 105 remain
> tuples: 3147 removed, 40 remain
> system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec

it seems like a serious omission that this gives you no hint how many
pages were scanned.

regards, tom lane


From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 01:33:39
Message-ID: 473BA1F3.60504@sun.com
Lists: pgsql-hackers

I don't understand vacuum very well, I admit :-)

When you say scanned, do you mean reads or do you mean writes?

Since it's really writes that I am having trouble with: the autovacuum
message tells me 11 pages were removed and so many tuples were removed,
so I am guessing it's writes.

I can try vacuuming that table before the run starts, to see if that
avoids it.

-Jignesh

Tom Lane wrote:
> "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM> writes:
>
>> So from the PostgreSQL side, things look fine based on these outputs; I
>> need to figure out the Solaris side of it now.
>>
>
>
>> Could it be related to autovacuum happening also?
>>
>
> Maybe ... have you tried fiddling with the vacuum_cost_delay options?
>
> Looking at the autovacuum log output,
>
>
>> 2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
>> "specdb.public.txn_log_table": index scans: 1
>> pages: 11 removed, 105 remain
>> tuples: 3147 removed, 40 remain
>> system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec
>>
>
> it seems like a serious omission that this gives you no hint how many
> pages were scanned.
>
> regards, tom lane
>


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 09:36:21
Message-ID: 473C1315.6020006@enterprisedb.com
Lists: pgsql-hackers

Jignesh K. Shah wrote:
> Since it's really writes that I am having trouble with: the autovacuum
> message tells me 11 pages were removed and so many tuples were removed,
> so I am guessing it's writes.

Do you keep track of I/O to WAL and data separately? WAL bandwidth will
spike up when a checkpoint starts, because of full page writes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 09:40:25
Message-ID: 473C1409.50500@enterprisedb.com
Lists: pgsql-hackers

Tom Lane wrote:
> Looking at the autovacuum log output,
>
>> 2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
>> "specdb.public.txn_log_table": index scans: 1
>> pages: 11 removed, 105 remain
>> tuples: 3147 removed, 40 remain
>> system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec
>
> it seems like a serious omission that this gives you no hint how many
> pages were scanned.

Isn't it pages removed + remain? 116 in this case.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, <pgsql-hackers(at)postgresql(dot)org>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 10:08:13
Message-ID: 87sl37x1ma.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:

> Tom Lane wrote:
>> Looking at the autovacuum log output,
>>
>>> 2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
>>> "specdb.public.txn_log_table": index scans: 1
>>> pages: 11 removed, 105 remain
>>> tuples: 3147 removed, 40 remain
>>> system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec
>>
>> it seems like a serious omission that this gives you no hint how many
>> pages were scanned.
>
> Isn't it pages removed + remain? 116 in this case.

How do 40 tuples take 105 pages?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 11:46:02
Message-ID: 20071115114602.GA19014@alvh.no-ip.org
Lists: pgsql-hackers

Tom Lane wrote:

> > 2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
> > "specdb.public.txn_log_table": index scans: 1
> > pages: 11 removed, 105 remain
> > tuples: 3147 removed, 40 remain
> > system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec
>
> it seems like a serious omission that this gives you no hint how many
> pages were scanned.

Hmm, right. I'm not sure how to fix it; the simplest idea is to count
the number of heap page accesses in lazy_scan_heap, but that wouldn't
count index pages, so it wouldn't be the whole picture. (However, we
already report "index scans", so maybe this is not all that important.)

Another, more complex idea would be to use the already existing
infrastructure for counting buffer accesses, as in ShowBufferUsage.
However, just calling ResetBufferUsage and then getting the counts would
make the counters useless for the outer reporters (the callers in
postgres.c). We could have a separate set of "save" counters: when
vacuum starts, save the current counters and reset them; do the vacuum
and report the counters; and finally restore the saved counters by
adding the current ones back in.
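
In outline, something like this (a sketch with stand-in types and
counter names, not the real ones):

#include <stdio.h>

/* stand-ins for the real buffer-usage counters */
typedef struct
{
    long reads;
    long hits;
} BufUsage;

static BufUsage current;            /* the "global" counters */

static void
vacuum_with_own_counts(void (*do_vacuum)(void))
{
    BufUsage saved = current;       /* save the outer counters */

    current.reads = current.hits = 0;   /* reset for the vacuum */
    do_vacuum();

    /* report the vacuum's own buffer usage */
    printf("vacuum: %ld reads, %ld hits\n", current.reads, current.hits);

    /* fold the outer counts back in so the callers still see them */
    current.reads += saved.reads;
    current.hits += saved.hits;
}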

Is this too complex?

--
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La tristeza es un muro entre dos jardines" (Khalil Gibran)


From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 14:03:09
Message-ID: 473C519D.9060805@sun.com
Lists: pgsql-hackers

Yes, I separate them out as follows:
PGDATA + one table which needs to be cached (also a workaround for the
CLOG read problem)
LOGS
DATABASE TABLES
DATABASE INDEXES
to get a good view of the I/O going out.

I have full_page_writes=off in my settings.
I don't see spikes on the WAL during checkpoints (maybe due to that
setting), just a constant load in the range of about 2-2.5 MB/sec,
which is not low, but then my load is high.

In my current run I do have async WAL on, with wal_writer_delay=100ms
and commit_delay off.

-Jignesh

Heikki Linnakangas wrote:
> Jignesh K. Shah wrote:
>> Since it's really writes that I am having trouble with: the autovacuum
>> message tells me 11 pages were removed and so many tuples were removed,
>> so I am guessing it's writes.
>
> Do you keep track of I/O to WAL and data separately? WAL bandwidth
> will spike up when a checkpoint starts, because of full page writes.
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 15:03:44
Message-ID: 2182.1195139024@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> Tom Lane wrote:
>> it seems like a serious omission that this gives you no hint how many
>> pages were scanned.

> Hmm, right. I'm not sure how to fix it; the simplest idea is to count
> the number of heap page accesses in lazy_scan_heap, but that wouldn't
> count index pages, so it wouldn't be the whole picture. (However, we
> already report "index scans", so maybe this is not all that important.)

> Another, more complex idea would be to use the already existing
> infrastructure for counting buffer accesses, as in ShowBufferUsage.
> However, just calling ResetBufferUsage and then getting the counts would
> make the counters useless for the outer reporters (the callers in
> postgres.c). We could have a separate set of "save" counters: when
> vacuum starts, save the current counters and reset them; do the vacuum
> and report the counters; and finally restore the saved counters by
> adding the current ones back in.

> Is this too complex?

Too complex for my taste, anyway. I would be satisfied if the log
entries just indicated how big the table and indexes were. Heikki
pointed out that the message does tell you how many heap pages there
were, if you know to add removed + remain; but unless I'm still missing
something there's no way to tell how many index pages.

regards, tom lane


From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-15 15:24:39
Message-ID: 20071115152439.GF19014@alvh.no-ip.org
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > Tom Lane wrote:
> >> it seems like a serious omission that this gives you no hint how many
> >> pages were scanned.
>
> Too complex for my taste, anyway. I would be satisfied if the log
> entries just indicated how big the table and indexes were. Heikki
> pointed out that the message does tell you how many heap pages there
> were, if you know to add removed + remain; but unless I'm still missing
> something there's no way to tell how many index pages.

Maybe we can save RelationGetNumberOfBlocks() at the start of vacuum for
heap and indexes, and report it as two numbers (total heap pages: X,
total index pages: Y).

We would still be missing the TOAST rel and index ...

--
Alvaro Herrera Developer, http://www.PostgreSQL.org/
"Cuando miro a alguien, más me atrae cómo cambia que quién es" (J. Binoche)


From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: LDC - Load Distributed Checkpoints with PG8.3b2 on Solaris
Date: 2007-11-19 13:58:38
Message-ID: 20071119135838.GC13505@alvh.no-ip.org
Lists: pgsql-hackers

Gregory Stark wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>
> > Tom Lane wrote:
> >> Looking at the autovacuum log output,
> >>
> >>> 2007-11-13 09:21:19.830 PST 9458 LOG: automatic vacuum of table
> >>> "specdb.public.txn_log_table": index scans: 1
> >>> pages: 11 removed, 105 remain
> >>> tuples: 3147 removed, 40 remain
> >>> system usage: CPU 0.11s/0.09u sec elapsed 6.02 sec
> >>
> >> it seems like a serious omission that this gives you no hint how many
> >> pages were scanned.
> >
> > Isn't it pages removed + remain? 116 in this case.
>
> How do 40 tuples take 105 pages?

The first 65 pages could be empty.

--
Alvaro Herrera http://www.PlanetPostgreSQL.org/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree. (Don Knuth)