Re: PERFORMANCE and SIZE

Lists: pgsql-performance
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Alfranio Junior" <alfranio(at)lsd(dot)di(dot)uminho(dot)pt>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PERFORMANCE and SIZE
Date: 2003-05-12 15:49:39
Message-ID: 200305120849.39381.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Alfranio,

> I'm a new PostgresSql user and I do not know so much about the
> performance mechanisms currently implemented and available.
<snip>
> Does anybody know what is happening ?

90% likely: You haven't run VACUUM FULL ANALYZE in a while.

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: "Alfranio Junior" <alfranio(at)lsd(dot)di(dot)uminho(dot)pt>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: PERFORMANCE and SIZE
Date: 2003-05-12 19:35:24
Message-ID: 011d01c318be$997217f0$9002a8c0@ialfranio
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hello,

I'm a new PostgresSql user and I do not know so much about the
performance mechanisms currently implemented and available.

So, as a dummy user I think that something strange is happening with me.
When I run the following command:

explain analyze select * from customer
where c_last = 'ROUGHTATION' and
c_w_id = 1 and
c_d_id = 1
order by c_w_id, c_d_id, c_last, c_first limit 1;

I receive the following results:

(Customer table with 60.000 rows) -
QUERY PLAN
---------------------------------------------------------------------------
-----------------------------------------------------------
Limit (cost=4.84..4.84 rows=1 width=283) (actual time=213.13..213.13
rows=0 loops=1)
-> Sort (cost=4.84..4.84 rows=1 width=283) (actual
time=213.13..213.13 rows=0 loops=1)
Sort Key: c_w_id, c_d_id, c_last, c_first
-> Index Scan using pk_customer on customer (cost=0.00..4.83
rows=1 width=283) (actual time=211.93..211.93 rows=0 loops=1)
Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
Filter: (c_last = 'ROUGHTATION'::bpchar)
Total runtime: 213.29 msec
(7 rows)

(Customer table with 360.000 rows) -
QUERY PLAN
---------------------------------------------------------------------------
-------------------------------------------------------------
Limit (cost=11100.99..11101.00 rows=1 width=638) (actual
time=20.82..20.82 rows=0 loops=1)
-> Sort (cost=11100.99..11101.00 rows=4 width=638) (actual
time=20.81..20.81 rows=0 loops=1)
Sort Key: c_w_id, c_d_id, c_last, c_first
-> Index Scan using pk_customer on customer
(cost=0.00..11100.95 rows=4 width=638) (actual time=20.40..20.40 rows=0
loops=1)
Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
Filter: (c_last = 'ROUGHTATION'::bpchar)
Total runtime: 21.11 msec
(7 rows)

Increasing the number of rows the total runtime decreases.
The customer table has the following structure:
CREATE TABLE customer
(
c_id int NOT NULL ,
c_d_id int4 NOT NULL ,
c_w_id int4 NOT NULL ,
c_first char (16) NULL ,
c_middle char (2) NULL ,
c_last char (16) NULL ,
c_street_1 char (20) NULL ,
c_street_2 char (20) NULL ,
c_city char (20) NULL ,
c_state char (2) NULL ,
c_zip char (9) NULL ,
c_phone char (16) NULL ,
c_since timestamp NULL ,
c_credit char (2) NULL ,
c_credit_lim numeric(12, 2) NULL ,
c_discount numeric(4, 4) NULL ,
c_balance numeric(12, 2) NULL ,
c_ytd_payment numeric(12, 2) NULL ,
c_payment_cnt int4 NULL ,
c_delivery_cnt int4 NULL ,
c_data text NULL
);

ALTER TABLE customer ADD
CONSTRAINT PK_customer PRIMARY KEY
(
c_w_id,
c_d_id,
c_id
);

Does anybody know what is happening ?

Thanks !!!!

Alfranio Junior


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Alfranio Junior" <alfranio(at)lsd(dot)di(dot)uminho(dot)pt>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PERFORMANCE and SIZE
Date: 2003-05-13 16:15:34
Message-ID: 200305130915.34850.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Alfranio,

> And now, the optimizer started to use a table scan and in consequence gives
> me:

What appears to me to be happening is that the planner has incorrect estimates
of the cost of an index lookup. The base estimate is contained in the
postgresql.conf parameter:
cpu_index_tuple_cost = 0.001

From the look of things, your disk/array has much better random seek times
than the standard, or you have enough RAM to cache most of your tables.
Either way, I would experiment with lowering the index_tuple_cost to, say,
0.0003 and see if you get better use of indexes.

If that does work for you, make sure to check some other queries unrelated to
the "customers" table to make sure that the new setting doesn't mess them up
in some way.

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: "Alfranio Junior" <alfranio(at)lsd(dot)di(dot)uminho(dot)pt>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PERFORMANCE and SIZE
Date: 2003-05-13 20:28:58
Message-ID: 012a01c3198e$48957270$9002a8c0@ialfranio
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Josh,

I ran the vacuumdb as follows:
vacuumdb -f -v -e -a

and after that,

vacuumdb -z -v -e -a.

And now, the optimizer started to use a table scan and in consequence gives
me:

explain analyze select * from customer

where c_last = 'ROUGHTATION' and

c_w_id = 1 and

c_d_id = 1

order by c_w_id, c_d_id, c_last, c_first limit 1;

QUERY PLAN

----------------------------------------------------------------------------
-----------------------------------------

Limit (cost=6302.03..6302.03 rows=1 width=639) (actual time=208.33..208.33
rows=0 loops=1)

-> Sort (cost=6302.03..6302.04 rows=3 width=639) (actual time=208.32..208.32
rows=0 loops=1)

Sort Key: c_w_id, c_d_id, c_last, c_first

-> Seq Scan on customer (cost=0.00..6302.00 rows=3 width=639) (actual
time=207.99..207.99 rows=0 loops=1)

Filter: ((c_last = 'ROUGHTATION'::bpchar) AND (c_w_id = 1) AND (c_d_id = 1))

Total runtime: 208.54 msec

(6 rows)

When I force the index use a receive a better result:

set enable_seqscan to off;

explain analyze select * from customer

where c_last = 'ROUGHTATION' and

c_w_id = 1 and

c_d_id = 1

order by c_w_id, c_d_id, c_last, c_first limit 1;

QUERY PLAN

----------------------------------------------------------------------------
-----------------------------------------------------------

Limit (cost=9860.03..9860.03 rows=1 width=639) (actual time=13.98..13.98
rows=0 loops=1)

-> Sort (cost=9860.03..9860.04 rows=3 width=639) (actual time=13.98..13.98
rows=0 loops=1)

Sort Key: c_w_id, c_d_id, c_last, c_first

-> Index Scan using pk_customer on customer (cost=0.00..9860.00 rows=3
width=639) (actual time=13.86..13.86 rows=0 loops=1)

Index Cond: ((c_w_id = 1) AND (c_d_id = 1))

Filter: (c_last = 'ROUGHTATION'::bpchar)

Total runtime: 14.11 msec

(7 rows)

Is this the only way to force the index ?
What are the reasons to the optimizer to decide for a worse plan ?

> Alfranio,
>
> > I'm a new PostgresSql user and I do not know so much about the
> > performance mechanisms currently implemented and available.
> <snip>
> > Does anybody know what is happening ?
>
> 90% likely: You haven't run VACUUM FULL ANALYZE in a while.
>
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>


From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Alfranio Junior <alfranio(at)lsd(dot)di(dot)uminho(dot)pt>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PERFORMANCE and SIZE
Date: 2003-05-13 20:51:37
Message-ID: Pine.LNX.4.33.0305131450320.29172-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Tue, 13 May 2003, Josh Berkus wrote:

> Alfranio,
>
> > And now, the optimizer started to use a table scan and in consequence gives
> > me:
>
> What appears to me to be happening is that the planner has incorrect estimates
> of the cost of an index lookup. The base estimate is contained in the
> postgresql.conf parameter:
> cpu_index_tuple_cost = 0.001
>
> >From the look of things, your disk/array has much better random seek times
> than the standard, or you have enough RAM to cache most of your tables.
> Either way, I would experiment with lowering the index_tuple_cost to, say,
> 0.0003 and see if you get better use of indexes.
>
> If that does work for you, make sure to check some other queries unrelated to
> the "customers" table to make sure that the new setting doesn't mess them up
> in some way.

Also, you can lower random page cost. And make sure the query planner has
some idea how much effective cache you have, as it can kind of take that
into account too. i.e. a machine wiht 800 Meg cache is far more likely to
have data in memory than one 100 MEg cache. This is kernel cache I'm
talking about, by the way. effective cache size is set in 8k blocks.


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alfranio Junior <alfranio(at)lsd(dot)di(dot)uminho(dot)pt>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PERFORMANCE and SIZE
Date: 2003-05-15 03:30:17
Message-ID: 200305150330.h4F3UHV03141@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


I have gotten so much spam, this subject line struck me as spam until I
looked closer. Did it catch anyone else?

---------------------------------------------------------------------------

Alfranio Junior wrote:
> Hello,
>
> I'm a new PostgresSql user and I do not know so much about the
> performance mechanisms currently implemented and available.
>
> So, as a dummy user I think that something strange is happening with me.
> When I run the following command:
>
> explain analyze select * from customer
> where c_last = 'ROUGHTATION' and
> c_w_id = 1 and
> c_d_id = 1
> order by c_w_id, c_d_id, c_last, c_first limit 1;
>
> I receive the following results:
>
> (Customer table with 60.000 rows) -
> QUERY PLAN
> ---------------------------------------------------------------------------
> -----------------------------------------------------------
> Limit (cost=4.84..4.84 rows=1 width=283) (actual time=213.13..213.13
> rows=0 loops=1)
> -> Sort (cost=4.84..4.84 rows=1 width=283) (actual
> time=213.13..213.13 rows=0 loops=1)
> Sort Key: c_w_id, c_d_id, c_last, c_first
> -> Index Scan using pk_customer on customer (cost=0.00..4.83
> rows=1 width=283) (actual time=211.93..211.93 rows=0 loops=1)
> Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
> Filter: (c_last = 'ROUGHTATION'::bpchar)
> Total runtime: 213.29 msec
> (7 rows)
>
>
> (Customer table with 360.000 rows) -
> QUERY PLAN
> ---------------------------------------------------------------------------
> -------------------------------------------------------------
> Limit (cost=11100.99..11101.00 rows=1 width=638) (actual
> time=20.82..20.82 rows=0 loops=1)
> -> Sort (cost=11100.99..11101.00 rows=4 width=638) (actual
> time=20.81..20.81 rows=0 loops=1)
> Sort Key: c_w_id, c_d_id, c_last, c_first
> -> Index Scan using pk_customer on customer
> (cost=0.00..11100.95 rows=4 width=638) (actual time=20.40..20.40 rows=0
> loops=1)
> Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
> Filter: (c_last = 'ROUGHTATION'::bpchar)
> Total runtime: 21.11 msec
> (7 rows)
>
> Increasing the number of rows the total runtime decreases.
> The customer table has the following structure:
> CREATE TABLE customer
> (
> c_id int NOT NULL ,
> c_d_id int4 NOT NULL ,
> c_w_id int4 NOT NULL ,
> c_first char (16) NULL ,
> c_middle char (2) NULL ,
> c_last char (16) NULL ,
> c_street_1 char (20) NULL ,
> c_street_2 char (20) NULL ,
> c_city char (20) NULL ,
> c_state char (2) NULL ,
> c_zip char (9) NULL ,
> c_phone char (16) NULL ,
> c_since timestamp NULL ,
> c_credit char (2) NULL ,
> c_credit_lim numeric(12, 2) NULL ,
> c_discount numeric(4, 4) NULL ,
> c_balance numeric(12, 2) NULL ,
> c_ytd_payment numeric(12, 2) NULL ,
> c_payment_cnt int4 NULL ,
> c_delivery_cnt int4 NULL ,
> c_data text NULL
> );
>
> ALTER TABLE customer ADD
> CONSTRAINT PK_customer PRIMARY KEY
> (
> c_w_id,
> c_d_id,
> c_id
> );
>
> Does anybody know what is happening ?
>
>
> Thanks !!!!
>
> Alfranio Junior
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Rudi Starcevic <rudi(at)oasis(dot)net(dot)au>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PERFORMANCE and SIZE
Date: 2003-05-15 03:46:05
Message-ID: 3EC30D7D.4070107@oasis.net.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Ha !

No - it didn't catch me -- but Yes my spam has been going through the
roof lately.
Over here in Australia it's in the Media alot of late - Spam increases.
Seems like everyone is suffering.

Cheers
RS.

Bruce Momjian wrote:

>I have gotten so much spam, this subject line struck me as spam until I
>looked closer. Did it catch anyone else?
>
>---------------------------------------------------------------------------
>
>Alfranio Junior wrote:
>
>
>> Hello,
>>
>>I'm a new PostgresSql user and I do not know so much about the
>> performance mechanisms currently implemented and available.
>>
>> So, as a dummy user I think that something strange is happening with me.
>> When I run the following command:
>>
>> explain analyze select * from customer
>> where c_last = 'ROUGHTATION' and
>> c_w_id = 1 and
>> c_d_id = 1
>> order by c_w_id, c_d_id, c_last, c_first limit 1;
>>
>> I receive the following results:
>>
>> (Customer table with 60.000 rows) -
>> QUERY PLAN
>> ---------------------------------------------------------------------------
>>-----------------------------------------------------------
>> Limit (cost=4.84..4.84 rows=1 width=283) (actual time=213.13..213.13
>> rows=0 loops=1)
>> -> Sort (cost=4.84..4.84 rows=1 width=283) (actual
>> time=213.13..213.13 rows=0 loops=1)
>> Sort Key: c_w_id, c_d_id, c_last, c_first
>> -> Index Scan using pk_customer on customer (cost=0.00..4.83
>> rows=1 width=283) (actual time=211.93..211.93 rows=0 loops=1)
>> Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
>> Filter: (c_last = 'ROUGHTATION'::bpchar)
>> Total runtime: 213.29 msec
>> (7 rows)
>>
>>
>> (Customer table with 360.000 rows) -
>> QUERY PLAN
>> ---------------------------------------------------------------------------
>>-------------------------------------------------------------
>> Limit (cost=11100.99..11101.00 rows=1 width=638) (actual
>> time=20.82..20.82 rows=0 loops=1)
>> -> Sort (cost=11100.99..11101.00 rows=4 width=638) (actual
>> time=20.81..20.81 rows=0 loops=1)
>> Sort Key: c_w_id, c_d_id, c_last, c_first
>> -> Index Scan using pk_customer on customer
>> (cost=0.00..11100.95 rows=4 width=638) (actual time=20.40..20.40 rows=0
>> loops=1)
>> Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
>> Filter: (c_last = 'ROUGHTATION'::bpchar)
>> Total runtime: 21.11 msec
>> (7 rows)
>>
>> Increasing the number of rows the total runtime decreases.
>> The customer table has the following structure:
>> CREATE TABLE customer
>> (
>> c_id int NOT NULL ,
>> c_d_id int4 NOT NULL ,
>> c_w_id int4 NOT NULL ,
>> c_first char (16) NULL ,
>> c_middle char (2) NULL ,
>> c_last char (16) NULL ,
>> c_street_1 char (20) NULL ,
>> c_street_2 char (20) NULL ,
>> c_city char (20) NULL ,
>> c_state char (2) NULL ,
>> c_zip char (9) NULL ,
>> c_phone char (16) NULL ,
>> c_since timestamp NULL ,
>> c_credit char (2) NULL ,
>> c_credit_lim numeric(12, 2) NULL ,
>> c_discount numeric(4, 4) NULL ,
>> c_balance numeric(12, 2) NULL ,
>> c_ytd_payment numeric(12, 2) NULL ,
>> c_payment_cnt int4 NULL ,
>> c_delivery_cnt int4 NULL ,
>> c_data text NULL
>> );
>>
>> ALTER TABLE customer ADD
>> CONSTRAINT PK_customer PRIMARY KEY
>> (
>> c_w_id,
>> c_d_id,
>> c_id
>> );
>>
>> Does anybody know what is happening ?
>>
>>
>> Thanks !!!!
>>
>> Alfranio Junior
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>http://www.postgresql.org/docs/faqs/FAQ.html
>>
>>
>>
>
>
>