Re: Use of indexes with table inheritance

From: Christopher Petrilli <petrilli(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: pgsql General List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Use of indexes with table inheritance
Date: 2005-02-07 23:08:40
Message-ID: 59d991c4050207150863d6b3b6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 7 Feb 2005 19:34:52 -0300, Alvaro Herrera
<alvherre(at)dcc(dot)uchile(dot)cl> wrote:
> Any reason you don't use a single table? Also, please post the EXPLAIN
> ANALYZE of your query.

My main reason was experimental evidence showed a massive drop in
performance with 6 indexes (the number needed from a query
perspective) when hitting 2-3M rows given the insert rate that I had
to sustain. I'm using COPY to insert 2000 records/second at this
point. This is an application that currently runs on Oracle and uses
SQL*Loader to get data into the database. On the Oracle side we use
Data Partioning to manage the data sets under it. We're talking a
data set that can easily be 10B rows. Because of how Data
Partitioning works, you can easily remove "chunks" of data when it
needs to be archived. This is also why I'm using inheritance.

bench2=# explain analyze select count(event_id) from events where
src_ip = 3232235513;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=261090.73..261090.73 rows=1 width=66) (actual
time=82950.432..82950.433 rows=1 loops=1)
-> Append (cost=0.00..260917.98 rows=69097 width=66) (actual
time=0.123..82854.881 rows=80171 loops=1)
-> Seq Scan on events (cost=0.00..0.00 rows=1 width=66)
(actual time=0.002..0.002 rows=0 loops=1)
Filter: (src_ip = 3232235513::bigint)
-> Index Scan using events001_src_ip_idx on events001 events
(cost=0.00..11323.63 rows=2925 width=36) (actual time=0.118..2008.688
rows=7885 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Index Scan using events002_src_ip_idx on events002 events
(cost=0.00..30803.25 rows=8001 width=36) (actual
time=12.137..17262.378 rows=7972 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Index Scan using events003_src_ip_idx on events003 events
(cost=0.00..11276.54 rows=2912 width=36) (actual
time=10.995..17387.986 rows=7993 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Index Scan using events004_src_ip_idx on events004 events
(cost=0.00..33378.92 rows=8667 width=36) (actual
time=13.344..17220.166 rows=8039 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Index Scan using events005_src_ip_idx on events005 events
(cost=0.00..29463.54 rows=7667 width=36) (actual
time=8.827..17220.986 rows=8131 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Seq Scan on events006 events (cost=0.00..35228.00
rows=9667 width=36) (actual time=7.823..3185.068 rows=8055 loops=1)
Filter: (src_ip = 3232235513::bigint)
-> Index Scan using events007_src_ip_idx on events007 events
(cost=0.00..11309.66 rows=2922 width=36) (actual time=0.115..1957.151
rows=8053 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Index Scan using events008_src_ip_idx on events008 events
(cost=0.00..30820.39 rows=8001 width=36) (actual time=0.107..1976.218
rows=8076 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Index Scan using events009_src_ip_idx on events009 events
(cost=0.00..32086.05 rows=8334 width=36) (actual time=0.117..2780.883
rows=7996 loops=1)
Index Cond: (src_ip = 3232235513::bigint)
-> Seq Scan on events010 events (cost=0.00..35228.00
rows=10000 width=36) (actual time=0.060..1716.214 rows=7971 loops=1)
Filter: (src_ip = 3232235513::bigint)
Total runtime: 82950.584 ms
(25 rows)

That's the second run of the same query, so that it can take advantage
of caching. The first run was 80% more expensive in runtime.

Chris
--
| Christopher Petrilli
| petrilli(at)gmail(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Devrim GUNDUZ 2005-02-07 23:37:37 [SECURITY] New sets of PostgreSQL RPMS are available for download
Previous Message Tim Vadnais 2005-02-07 23:08:12 a SELECT FOR UPDATE question