Re: (Mis)using the PostgreSQL planner to get estimated row counts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nick Johnson <arachnid(at)notdot(dot)net>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: (Mis)using the PostgreSQL planner to get estimated row counts
Date: 2005-05-31 16:00:13
Message-ID: 12893.1117555213@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Nick Johnson <arachnid(at)notdot(dot)net> writes:
> I'm trying to write a PostgreSQL extension to estimate the number of
> rows returned by a SELECT statement.

Instead of fooling around at the C level, why don't you just do an
EXPLAIN and parse out the first row of the result? For instance

regression=# create function estimate_row_count(text) returns text as $$
regression$# declare x record;
regression$# begin
regression$# for x in execute 'EXPLAIN ' || $1 loop
regression$# return substring(x."QUERY PLAN" from 'rows=([0-9]+) width=');
regression$# end loop;
regression$# end$$ language plpgsql strict;
CREATE FUNCTION
regression=# select estimate_row_count('select * from tenk1');
estimate_row_count
--------------------
10000
(1 row)

Of course this is subject to future breakage due to changes in the
output textual format, etc etc, but it's surely less fragile than
anything written in C will be.

Depending on what you want the results for, it might be best to ignore
any top-level LIMIT node.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message lucas 2005-05-31 16:36:48 Sum() rows
Previous Message Mya Mason 2005-05-31 15:44:43 new antidote found