Re: count(*) slow on large tables

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Browne <cbbrowne(at)libertyrms(dot)info>, pgsql-performance(at)postgresql(dot)org
Subject: Re: count(*) slow on large tables
Date: 2003-10-04 17:48:47
Message-ID: 200310041748.h94Hmlc04602@candle.pha.pa.us
Lists: pgsql-hackers pgsql-performance

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > We do have a TODO item:
> > * Consider using MVCC to cache count(*) queries with no WHERE clause
>
> > The idea is to cache a recent count of the table, then have
> > insert/delete add +/- records to the count. A COUNT(*) would get the
> > main cached record plus any visible +/- records. This would allow the
> > count to return the proper value depending on the visibility of the
> > requesting transaction, and it would require _no_ heap or index scan.
>
> ... and it would give the wrong answers. Unless the cache is somehow
> snapshot-aware, so that it can know which other transactions should be
> included in your count.

The cache is an ordinary table, with xid's on every row. I meant it
would require no index/heap scans of the large table --- it would still
require a scan of the "count" table.
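
A toy model may make the visibility point concrete. This is only a sketch in Python, not PostgreSQL code: the names CountRow, Snapshot, and cached_count are invented for illustration, the snapshot rule is heavily simplified (no row deletion, no subtransactions), and the set of committed xids is passed in by hand. It shows a small "count" table holding a base row plus +/- rows, each stamped with the xid that wrote it, and a count that sums only the rows the requesting snapshot can see.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Set


@dataclass(frozen=True)
class CountRow:
    """One row of the hypothetical 'count' table."""
    xid: int    # transaction that inserted this row
    delta: int  # base count, or a +n / -n adjustment


@dataclass(frozen=True)
class Snapshot:
    """Simplified snapshot: which xids count as 'in the past'."""
    xmax: int                        # first xid not yet assigned at snapshot time
    in_progress: FrozenSet[int]      # xids still running at snapshot time
    own_xid: Optional[int] = None    # the requesting transaction, if it has an xid

    def sees(self, xid: int, committed: Set[int]) -> bool:
        # A transaction always sees its own rows.
        if xid == self.own_xid:
            return True
        # Otherwise the writer must have started before the snapshot,
        # not been running when it was taken, and have committed.
        return xid < self.xmax and xid not in self.in_progress and xid in committed


def cached_count(rows: Iterable[CountRow], snap: Snapshot, committed: Set[int]) -> int:
    """Sum the base row and every +/- row visible to this snapshot."""
    return sum(r.delta for r in rows if snap.sees(r.xid, committed))


count_table = [
    CountRow(xid=1, delta=1000),  # cached base count
    CountRow(xid=5, delta=+3),    # committed insert of 3 rows
    CountRow(xid=7, delta=-2),    # committed delete of 2 rows
    CountRow(xid=9, delta=+10),   # insert by a transaction that has not committed
]
committed = {1, 5, 7}

# Snapshot taken while xid 7 was still running: its delete is not counted.
older = Snapshot(xmax=8, in_progress=frozenset({7}))
# Later snapshot: xid 7 is finished, xid 9 is still running.
newer = Snapshot(xmax=10, in_progress=frozenset({9}))
# Transaction 9 itself sees its own uncommitted +10.
own = Snapshot(xmax=10, in_progress=frozenset(), own_xid=9)

print(cached_count(count_table, older, committed))  # 1003
print(cached_count(count_table, newer, committed))  # 1001
print(cached_count(count_table, own, committed))    # 1011
```

The three snapshots get three different answers from the same rows, and only the small table is scanned. Collapsing old +/- rows back into the base row is left out of the sketch.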

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
