Re: Critical performance problems on large databases

Lists: pgsql-general
From: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Critical performance problems on large databases
Date: 2002-04-10 22:39:37
Message-ID: 3CB4BF29.1000700@aurora.regenstrief.org
Lists: pgsql-general

Hi,

it has been repeatedly noted that pgsql is somewhat slow on
selects from large tables. For instance, a colleague of mine
asked about this recently, and in the Google archives I found
the following pertinent question and response.

http://archives.postgresql.org/pgsql-sql/2000-03/msg00031.php

Basically if you have a large table with, say, a million rows
and you do

SELECT * FROM Bigtable;

it takes a long time for it to come up with the first page
of results, using up lots of computer resources etc., and after
the first page is returned the backend basically sits idle
for the rest of the query result retrieval.

We also noted that a

SELECT COUNT(*) FROM BigQuery;

can take quite a long time and again use a lot of resources,
whereas

SELECT COUNT(smallcolumn) FROM BigQuery;

may be faster and less resource-consuming.

Off the bat, this indicates to me that there is something
sub-optimal about how PostgreSQL handles simple queries. In
a database that should perform well in online user transactions,
one would want query processing to be streamed as much
as possible, i.e., since in a SELECT * FROM Bigtable; there is
no work needed other than to retrieve the tuples out of
physical storage, the response should be immediate and resource
usage low. There should not be large buffer allocations.

Conversely it looks as if PostgreSQL will always read a sizeable
piece (if not all?) of the result set into some buffer area before
returning a single row. This would explain the slow startup on
the SELECT * FROM Bigtable; query as well as the fact that
COUNT(smallcolumn) behaves much faster than COUNT(*).

Again, count should be streamed as well, so as to use no
significant memory resources other than the counter. Apparently
a COUNT(*) in postgres is executed as

SELECT * FROM Bigtable INTO $somebuffer
COUNT(tuples in $somebuffer)

Admittedly I am conjecturing here, but the evidence is strong,
especially because I can make a three-level subquery with GROUP BY
and all kinds of stuff run almost faster than a SELECT *
FROM Bigtable;

Any enlightenments? Am I wrong? Will this be fixed soon?
Is it hard to change pgsql to do better streaming of its
operations? As a corollary, I would presume that pgsql
won't benefit much from multi-CPU machines because it cannot
parallelize its activities. It may be less hard to make all
of pgsql thread-safe and threaded than it is to rewrite the
execution engine to stream between the different tasks
and do less buffering, right?

Or maybe all is fine with pgsql and we just haven't figured
out how to set up the configuration right?

thanks
-Gunther

PS: we are seriously looking into using pgsql as the core
of a BIG medical record system, but we already know that
if we can't get quick online responses (< 2 s) on
large result sets (10000 records), at least for the first
page (~ 100 records), we are in trouble.

--
Gunther Schadow, M.D., Ph.D. gschadow(at)regenstrief(dot)org
Medical Information Scientist Regenstrief Institute for Health Care
Adjunct Assistant Professor Indiana University School of Medicine
tel:1(317)630-7960 http://aurora.regenstrief.org


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-10 23:02:13
Message-ID: 23861.1018479733@sss.pgh.pa.us
Lists: pgsql-general

Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org> writes:
> We also noted that a
> SELECT COUNT(*) FROM BigQuery;
> can take quite a long time and again use a lot of resources,
> whereas
> SELECT COUNT(smallcolumn) FROM BigQuery;
> may be faster and less resource consuming.

This is complete nonsense... if anything, the second one will take
more cycles, since it has to actually examine a column.

As for the other thing, use a cursor and "fetch" a few rows at a time
if you want to overlap frontend and backend processing.

regards, tom lane


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-10 23:17:31
Message-ID: 20020410154437.W20046-100000@megazone23.bigpanda.com
Lists: pgsql-general

On Wed, 10 Apr 2002, Gunther Schadow wrote:

> Off the bat, this indicates to me that there is something
> sub-obtimal about PostgreSQL handling simple queries. From
> a database that should perform well in online user transactions
> one would want the query processing to be streamed as much
> as possible, i.e., since in a SELECT * FROM Bigtable; there is
> no work needed other than to retrieve the tuples out of
> physical storage, the response should be immediate and resource
> usage low. There should not be large buffer allocations.
>
> Conversely it looks as if PostgreSQL will always read a sizeable
> piece (if not all?) of the result set into some buffer area before
> returning a single row. This would explain the slow startup on
> the SELECT * FROM Bigtable; query as well as the fact that
> COUNT(smallcolumn) behaves much faster than COUNT(*).

IIRC, the entire result set is sent across in the select
* from bigtable case, possibly to allow random access to
the result set? Not sure really.

The usual way to deal with these cases is to use limit/offset
or a cursor to fetch pieces of the data as you want them (ie:
begin;
DECLARE foo CURSOR FOR SELECT * FROM Bigtable;
FETCH 100 from foo;
FETCH 100 from foo;
...
end;
)
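
The limit/offset version would look something like this (somekey is
just a placeholder; without an ORDER BY the pages aren't guaranteed
to be stable from one query to the next):

SELECT * FROM Bigtable ORDER BY somekey LIMIT 100 OFFSET 0;
SELECT * FROM Bigtable ORDER BY somekey LIMIT 100 OFFSET 100;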

> Again, count should be streamed as well such as to use no
> significant memory resources other than the counter. Apparently
> a COUNT(*) in postgres is executed as
>
> SELECT * FROM Bigtable INTO $somebuffer
> COUNT(tuples in $somebuffer)

I believe the thing that takes a long time here is that it has to do a
sequential scan of Bigtable, which for large tables is rather
time-consuming. I generally haven't seen backends grow much
on moderate-sized tables for such queries, although due to the
sequential scan I try to avoid count() whenever possible.


From: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 00:20:28
Message-ID: Pine.LNX.4.21.0204110040270.2690-100000@ponder.fairway2k.co.uk
Lists: pgsql-general


On Wed, 10 Apr 2002, Tom Lane wrote:

> Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org> writes:
> > We also noted that a
> > SELECT COUNT(*) FROM BigQuery;
> > can take quite a long time and again use a lot of resources,
> > whereas
> > SELECT COUNT(smallcolumn) FROM BigQuery;
> > may be faster and less resource consuming.
>
> This is complete nonsense... if anything, the second one will take
> more cycles, since it has to actually examine a column.
>

OK, I hate to do this, partly because I seem to remember mention of how this
sort of thing has to happen like this because the system isn't clever enough to
do the optimisation, but...

Shouldn't
SELECT count(*) FROM chat_post
give an immediate answer, especially when there's an index it can use?

An example, with what appears to be overkill, but the primary key is a two
column affair whereas the 'time' column counted in the second select has its
own index:

avid_chat_archive=> SET ENABLE_SEQSCAN TO FALSE;
SET VARIABLE
avid_chat_archive=> EXPLAIN ANALYZE SELECT COUNT(*) FROM chat_post;
NOTICE: QUERY PLAN:

Aggregate (cost=100020853.10..100020853.10 rows=1 width=0) (actual time=48557.48..48557.49 rows=1 loops=1)
  -> Seq Scan on chat_post (cost=100000000.00..100018291.08 rows=1024808 width=0) (actual time=6.68..32598.56 rows=1024808 loops=1)
Total runtime: 48557.93 msec

EXPLAIN
avid_chat_archive=> SELECT COUNT(*) FROM chat_post;
count
---------
1024808
(1 row)

avid_chat_archive=> EXPLAIN ANALYZE SELECT COUNT(time) FROM chat_post;
NOTICE: QUERY PLAN:

Aggregate (cost=100020853.10..100020853.10 rows=1 width=8) (actual time=51314.54..51314.55 rows=1 loops=1)
  -> Seq Scan on chat_post (cost=100000000.00..100018291.08 rows=1024808 width=8) (actual time=6.78..35012.81 rows=1024808 loops=1)
Total runtime: 51314.97 msec

EXPLAIN
avid_chat_archive=> SELECT COUNT(time) FROM chat_post;
count
---------
1024808
(1 row)

avid_chat_archive=> \d chat_post

Table "chat_post"
Column | Type | Modifiers
-------------+--------------------------+-----------
session_id | smallint | not null
poster_name | character varying(32) | not null
time | timestamp with time zone | not null
post_number | smallint | not null
Indexes: chat_post_time_key,
chat_post_user_key
Primary key: chat_post_pkey
Triggers: RI_ConstraintTrigger_4369014,
RI_ConstraintTrigger_4369012,
RI_ConstraintTrigger_4369010,
RI_ConstraintTrigger_4369008,
RI_ConstraintTrigger_4369006

avid_chat_archive=>

Having muttered about the primary key using two columns, I see the planner can
see the table size without having to resort to an index. Which makes sense if
only I'd turned my brain on first.

Anyway, the question still stands: why does postgres do this query this
way? It is doing the full sequential scan, i.e. fetching the tuples from
disk, when this data is not necessary for the query result. Is it to do with
the calling requirements of count(), other aggregate functions and/or functions
in general when used in the return list, and/or does it require too much
intelligence for the system to determine such optimisations?

And, I can't see this specific item addressed in the FAQ. I'm sure I'm not the
only one to have brought this up (oh wait, I'm replying to a related message,
so obviously not). Shouldn't it be there?

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants


From: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 00:26:25
Message-ID: Pine.LNX.4.21.0204110124460.2690-100000@ponder.fairway2k.co.uk
Lists: pgsql-general


On Thu, 11 Apr 2002, Nigel J. Andrews wrote:
>
> On Wed, 10 Apr 2002, Tom Lane wrote:
>
> > Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org> writes:
> > > We also noted that a
> > > SELECT COUNT(*) FROM BigQuery;
> > > can take quite a long time and again use a lot of resources,
> > > whereas
> > > SELECT COUNT(smallcolumn) FROM BigQuery;
> > > may be faster and less resource consuming.
> >
> > This is complete nonsense... if anything, the second one will take
> > more cycles, since it has to actually examine a column.
> >
>
> OK, I hate to do this, partly because I seem to remember mention of how this
> sort of thing has to happen like this because the system isn't clever enough to
> do the optimisation, but...
>
> [sniped]
>

Forgot to say this is on 7.2; I'm currently having some odd difficulty running
the regression tests for 7.2.1.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 00:41:08
Message-ID: 20020410173843.U22695-100000@megazone23.bigpanda.com
Lists: pgsql-general


> Having muttered about the primary key using two columns I see the planner can
> see the table size without having to revert to an index. Which makes sense if
> only I'd turned my brain on first.

The planner only gets the estimate from the last analyze/vacuum.
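
If an approximate number is good enough, you can read that estimate
straight out of the system catalogs instead of scanning the table,
something like:

SELECT relname, reltuples FROM pg_class WHERE relname = 'chat_post';

(reltuples is only as fresh as the last VACUUM/ANALYZE, so it's an
estimate, not an exact count.)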

> Anyway, the question still stands, why does postgres do this query this
> way? It is doing the full sequential scan, i.e. fetching the tuples from
> disk, when this data is not necessary for the query result. Is it to do with
> calling requirement of count(), other aggregate functions and/or functions in
> general when used in the return list and/or that it requires too much
> intelligence for the system to determine such optimisations?

The reason is that it needs to visit those rows anyway to see whether they
are visible to your transaction. If you're visiting every row, a sequential
scan is faster than an index scan. If the index had the visibility
information this could be done purely by index, but there are problems
with doing that as well (I don't know them, but they're in messages on
this topic from the past).


From: "P(dot)J(dot) \"Josh\" Rovero" <rovero(at)sonalysts(dot)com>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>, pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 01:41:47
Message-ID: 3CB4E9DB.3030006@sonalysts.com
Lists: pgsql-general

I can return result sets of 4K records from a database with 2M records
in tenths of a second. Smaller sets of 100 to 800 records take 0.02 to
0.1 second. This is on Linux, Pentium III at 500 MHz, pg 7.2.1.

Finding the right indexes made a *big* difference. Index on
the same criteria you use to select.
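
For example (table and column names made up here), if the typical
query looks like

SELECT * FROM readings WHERE station_id = 42
ORDER BY obs_time DESC LIMIT 100;

then an index on those same columns is the one that helps:

CREATE INDEX readings_station_time_idx ON readings (station_id, obs_time);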

--
P. J. "Josh" Rovero Sonalysts, Inc.
Email: rovero(at)sonalysts(dot)com www.sonalysts.com 215 Parkway North
Work: (860)326-3671 or 442-4355 Waterford CT 06385
***********************************************************************


From: Bill Gribble <grib(at)linuxdevel(dot)com>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 12:01:54
Message-ID: 1018526515.29603.34.camel@flophouse
Lists: pgsql-general

On Wed, 2002-04-10 at 17:39, Gunther Schadow wrote:
> PS: we are seriously looking into using pgsql as the core
> of a BIG medical record system, but we already know that
> if we can't get quick online responses (< 2 s) on
> large rasult sets (10000 records) at least at the first
> page (~ 100 records) we are in trouble.

There are a few tricks to getting fast results for pages of data in
large tables. I have an application in which we have a scrolling window
displaying data from a million-row table, and I have been able to make
it fairly interactively responsive (enough that it's not a problem).

We grab pages of a few screenfuls of data at a time using LIMIT /
OFFSET, enough to scroll smoothly over a short range. For LIMIT /
OFFSET queries to be fast, I found it was necessary to CREATE INDEX,
CLUSTER and ORDER BY the key field.
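In SQL terms that is roughly the following (names made up; the CLUSTER
syntax shown is the 7.2 form):

CREATE INDEX bigtable_sortkey_idx ON bigtable (sortkey);
CLUSTER bigtable_sortkey_idx ON bigtable;
SELECT * FROM bigtable ORDER BY sortkey LIMIT 50 OFFSET 1000;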

Then the biggest slowdown is count(*), which we have to do in order to
fake up the scrollbar (so we know what proportion of the data has been
scrolled through). I have not completely fixed this yet. I want to
keep a separate mini-table of how many records are in the big table and
update it with a trigger (the table is mostly static). ATM, I just try
hard to minimize the times I call count(*).
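
A rough sketch of what such a trigger-maintained counter might look
like (untested, names made up; assumes plpgsql is installed, and on
7.2 a plpgsql trigger function is declared to return OPAQUE):

CREATE TABLE bigtable_count (n bigint NOT NULL);
-- prime the counter once
INSERT INTO bigtable_count SELECT count(*) FROM bigtable;

CREATE FUNCTION bigtable_count_trig() RETURNS opaque AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE bigtable_count SET n = n + 1;
    ELSE
        UPDATE bigtable_count SET n = n - 1;
    END IF;
    RETURN NULL;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER bigtable_count_trigger
    AFTER INSERT OR DELETE ON bigtable
    FOR EACH ROW EXECUTE PROCEDURE bigtable_count_trig();

The catch is that every insert or delete then updates the same single
counter row, which serializes concurrent writers; that's another
reason the trigger hurts load time.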

b.g.


From: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 13:05:07
Message-ID: Pine.LNX.4.21.0204111346170.2690-100000@ponder.fairway2k.co.uk
Lists: pgsql-general

On 11 Apr 2002, Bill Gribble wrote:

> On Wed, 2002-04-10 at 17:39, Gunther Schadow wrote:
> > PS: we are seriously looking into using pgsql as the core
> > of a BIG medical record system, but we already know that
> > if we can't get quick online responses (< 2 s) on
> > large rasult sets (10000 records) at least at the first
> > page (~ 100 records) we are in trouble.
>
> There are a few tricks to getting fast results for pages of data in
> large tables. I have an application in which we have a scrolling window
> displaying data from a million-row table, and I have been able to make
> it fairly interactively responsive (enough that it's not a problem).
>
> We grab pages of a few screenfuls of data at a time using LIMIT /
> OFFSET, enough to scroll smoothly over a short range. For LIMIT /
> OFFSET queries to be fast, I found it was necessary to CREATE INDEX,
> CLUSTER and ORDER BY the key field.

I took a different approach in what I did. I considered selecting only the
requested page of data. However, once the query has been requested it is
likely that more than just that one page will be viewed by the user. On top
of that, I always use a chronological time order for results. I could
probably avoid specifying that explicitly in the query if I reloaded the
data in time order to sort out the few blocks or rows that are out of
sequence, but it isn't exactly correct to rely on the 'natural' order, is
it? This normal usage means that I will very rarely avoid a long running
query on this large table, and since my web server doesn't transmit the page
data until the CGI-BIN process has finished, I decided to utilise all the
multi-page data from the long running query on its first request by caching
the entire set of pages for it.

I can do this because of the nature of the updates to the data. Updates come
in batches at known times, and so the cached presentation pages can be held
and then expired at the appropriate times.

This doesn't give the quick response time Gunther is looking for, but I think
it is an alternative for some situations.

> Then the biggest slowdown is count(*), which we have to do in order to
> fake up the scrollbar (so we know what proportion of the data has been
> scrolled through). I have not completely foxed this yet. I want to
> keep a separate mini-table of how many records are in the big table and
> update it with a trigger (the table is mostly static). ATM, I just try
> hard to minimize the times I call count(*).

Now this idea I like. I do count some things in the big table and, again,
despite there being an index on the column used in the GROUP BY it does a
seqscan. However, I can see that adding triggers on insert etc. to maintain
counts in a table is going to hit the data loading time, but it is going to
speed up the count accesses tremendously.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants


From: Shaun Thomas <sthomas(at)townnews(dot)com>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 13:33:59
Message-ID: Pine.LNX.4.44.0204110759120.6628-100000@hamster.lee.net
Lists: pgsql-general

On Wed, 10 Apr 2002, Gunther Schadow wrote:

> SELECT * FROM Bigtable;
>
> it takes a long time for it to come up with the first page
> of results using up lots of computer resources etc and after
> the first page is returned the backend basically goes into
> idle mode for all the rest of the query results retrieval.

Very simple. PHP and Perl DBI, when doing non-cursor queries, will try
to buffer the *entire* result into memory before letting you have it.
In PHP, when you run a pg_exec, or mysql_query, or whatever, it actually
executes the query and stores all of the results in memory. Don't
believe me? Try closing the database connection, and then do a fetch on
the result identifier. Heck, fetch until you reach the end.

You'll get all of your results. Even though the database is closed.

So when you are doing a "SELECT * FROM bigtable", you're telling PHP to
not only buffer over a million rows, but to transfer it from the
database as well. "SELECT COUNT(*) FROM bigtable" doesn't have
this overhead. The database just returns a single row, which you
can view basically as soon as the database is done.

There's also something to be said for the LIMIT clause. Somehow I
doubt you need every single row in the entire table in order to do
your work on that page. Just adding LIMIT/OFFSET will increase your
speed significantly.

Or, as someone else mentioned, use a cursor, and let the database buffer
your query. It'll be slow too, but you can at least fetch the rows
individually and get the perception of speed in your application.

> no work needed other than to retrieve the tuples out of
> physical storage, the response should be immediate and resource
> usage low. There should not be large buffer allocations.

I see how you can get into this kind of thinking. Since select * has no
ordering, no calculations, nothing but just returning raw results from
the table in whatever format they may be in, how could it possibly be
resource intensive? But the logic is wrong. If you want all the data,
it'll give it to you. Whether you do it at the console, and it has to
throw everything at the screen buffer, or your application, which has
to put it in some temporary storage until it gets what it needs and
deallocates the memory. That data has to go *somewhere*, it can't just
exist in limbo until you decide you want to use some of it.

You can use cursors to get around this, because the database basically
runs the query and doesn't send back squat until you actually ask it to.
But I'll tell you a well selected limit clause will work almost as well,
and reduce the need for the database to maintain a cursor.

Say you're on page 5, and you are showing 20 results per page. Knowing
that results start at 0, you can get the offset just by doing:

(PAGE - 1) * PER_PAGE = (5-1) * 20 = 4*20 = 80. So your query becomes:

SELECT * FROM Bigtable LIMIT 20 OFFSET 80.

And voila. You'll get back 20 rows right where you want them. Also,
take care with what you select from the table. Maybe you don't actually
need all of the data, but only certain columns. The less data that has to
be transferred from database to application, the faster you can get/show
your data. Especially if your database is on a separate machine from
your application server. Network transfers are *not* instantaneous.

Try:

SELECT col1,col2,col3,etc FROM Bigtable LIMIT x OFFSET y

instead. You'll save yourself some time, save the database the effort
of loading all million rows of every column in the table into the
network interface, and save your application the need to fetch them.

> COUNT(smallcolumn) behaves much faster than COUNT(*).

Or you can do what every database optimization book in the entire
universe says, and do COUNT(1). Since all you're testing is the
existence of the row, not any value of anything in it.

> Again, count should be streamed as well such as to use no
> significant memory resources other than the counter.

Actually, it's converted into: "*, eh? I think not. Let me rewrite
this query and put a 1 here instead."

The database tries to keep people from shooting themselves in the foot,
but even the most Herculean effort will fail when the user is bound and
determined to blow off their big toe.

> Any enlightenments? Am I wrong? Will this be fixed soon?
> Is it hard to change pgsql to do better streaming of its
> operations.

Read a few books on database design and optimization, the basics of
application memory allocation, and the TCP/IP stack. If you want to
understand how to make an application fast from front to back, you have
to understand the components that make it work. Your knowledge of
application memory performance and network latency seems inherently
flawed, and until you get over the assumption that network transfers are
free and optimizing queries is a fool's errand, you'll continue to have
problems in any database you choose.

--
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+
| Shaun M. Thomas INN Database Administrator |
| Phone: (309) 743-0812 Fax : (309) 743-0830 |
| Email: sthomas(at)townnews(dot)com AIM : trifthen |
| Web : www.townnews.com |
| |
| "Most of our lives are about proving something, either to |
| ourselves or to someone else." |
| -- Anonymous |
+-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-+


From: Andrew Sullivan <andrew(at)libertyrms(dot)info>
To: PostgreSQL general list <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 14:26:41
Message-ID: 20020411102641.E19037@mail.libertyrms.com
Lists: pgsql-general

On Thu, Apr 11, 2002 at 02:05:07PM +0100, Nigel J. Andrews wrote:

> On 11 Apr 2002, Bill Gribble wrote:

> > Then the biggest slowdown is count(*), which we have to do in order to
> > fake up the scrollbar (so we know what proportion of the data has been
> > scrolled through).

> seqscan. However, I can see that adding triggers to insert etc. in
> a table and maintain counts is going to hit the data loading time
> but is going to speed the count accesses tremendously.

I suspect the trigger would be pretty miserable for inserts. But why
use a trigger? If you need just pretty-close results, you could have
a process that runs (say) every 10 minutes which updates a table of
stats. (Naturally, if you need really accurate numbers, that's no
help. But if the idea is just a "record _r_ of _nnnn_ results" sort
of message, you can do what large databases have been doing for
years: if a query returns more than some small-ish number of records,
look in the stats table. Then you say "record _r_ of approximately
_nnnn_ results".)
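
A minimal sketch of that kind of stats table (names made up):

CREATE TABLE rowcount_stats (
    tablename   text PRIMARY KEY,
    approx_rows bigint,
    updated     timestamp with time zone
);

-- run every few minutes from cron, e.g. via psql -c '...'
UPDATE rowcount_stats
   SET approx_rows = (SELECT count(*) FROM bigtable), updated = now()
 WHERE tablename = 'bigtable';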

A

--
----
Andrew Sullivan 87 Mowat Avenue
Liberty RMS Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M6K 3E3
+1 416 646 3304 x110


From: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 16:09:05
Message-ID: 3CB5B521.4040909@aurora.regenstrief.org
Lists: pgsql-general

Hi,

thanks everyone who responded. I want to set this whole COUNT(...)
issue aside for a moment because I have reported hearsay instead
of my own experiences. I apologize for muddying the waters with
that COUNT issue. But I still want to respond to the SELECT *
FROM Bigtable; issue once more.

[[[I have to say I am a bit confused by some of the responses, which
basically shrugged the problem off, sending the asker anywhere from
back to the schoolbooks to the mental facility. That's not
necessary.]]]

I am delighted to hear that one respondent has no problem with a
2M-record table and snappy response, but unfortunately he didn't say
much about the details: was that really a select * from bigtable;
query, or were there where clauses and such that limit the result set
considerably? A number of respondents confirmed my observations,
so I think the problem is real.

There was one remark about Perl or PHP always loading the complete
result set before returning. Bad for them. I don't use either, and
I think it's just bad design to do that on the client, but I don't
care about bad clients. I care about a good server.

The constructive responses suggested that I use LIMIT/OFFSET and
CURSORs. I can see how that could be a workaround for the problem, but
I still believe that something is wrong with the PostgreSQL query
executor. Loading the entire result set into a buffer without
need just makes no sense. Good database engines try to provide
for parallel execution of the query plan as much as possible, and
that implies streaming. There's a body of literature testifying
to its importance.

The main reasons for this are (a) the use on multi-processor machines,
where one CPU does one task and another does another task on the
same query plan and the results from CPU 1 are streamed to CPU 2 (hey,
I have a 6-processor machine in my basement). Perhaps more importantly,
(b) buffering (without need) is inherently bad, because it wastes
memory resources, leads to bursty demand on CPU and network, and slows
perceived response times. Buffering is a complete waste if the buffer
is being paged out to disk again, and it isn't flexible or scalable
if buffer pages are fixed in physical memory. Streaming is
especially important if you want to do distributed joins (and though
pgsql doesn't support that yet, it would be foolish to close your eyes
to a fundamental problem and then be forced to rework this in
a hurry when the time comes for distributed PostgreSQL).

So, while my client application might benefit from such things as
cursors and OFFSET/LIMIT, the query planning and execution may
suffer from the buffering. And of course, the point is that it makes
sense to design the server such that streaming results to the
client is transparent, because it automatically relieves the strain
on all resources: CPU, storage and network! Isn't that obvious?

regards
-Gunther

--
Gunther Schadow, M.D., Ph.D. gschadow(at)regenstrief(dot)org
Medical Information Scientist Regenstrief Institute for Health Care
Adjunct Assistant Professor Indiana University School of Medicine
tel:1(317)630-7960 http://aurora.regenstrief.org


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 16:54:22
Message-ID: 20020411094638.E34142-100000@megazone23.bigpanda.com
Lists: pgsql-general


> The constructive responses suggested that I use LIMIT/OFFSET and
> CURSORs. I can see how that could be a workaround the problem, but
> I still believe that something is wrong with the PostgreSQL query
> executer. Loading the entire result set into a buffer without
> need just makes no sense. Good data base engines try to provide
> for parallel execution of the query plan as much as possible, and
> that implies streaming. There's a crowd of literature about this
> testifying for it's importance.
>
> The main reasons for this is (a) the use on multi-processor machines
> where one CPU does one task and the other does another task on the
> same query plan and the results from CPU 1 is streamed to CPU 2 (hey,
> I have a 6 processor machine in my basement.) Perhaps more importantly
> (b) buffering (without need) is inherently bad, because it wastes
> memory resources leads to bursty demand on CPU and network, and slow
> perceived response times. Buffering is a complete waste if the buffer
> is being paged out to disk again and it isn't flexible or scaleable
> if buffer pages are fixed into physical memory. Straming is
> especially important if you want to do distributed joins (and though
> pgsql doesn't support that yet it would be foolish to close your eyes
> before a fundamental problem and then being forced to rework this in
> a hurry when the time comes for distributed PostgreSQL.)
>
> So, while my client application might benefit from such things as
> cursors and OFFSET/LIMIT, the query planning and executing may
> suffer from the buffering. And of course, the point is that it makes
> sense to design the server such that streaming results to the
> client is transparent because it automatically relieves the strain
> on all resources, CPU, storage and network! Isn't that obvious?

My limited understanding of the internals is that, except for steps
that need their entire input before producing output (e.g. a sort) and
the final step of sending everything to the client, each node
internally asks other nodes for rows as it needs them, and those nodes
can provide results on an as-needed basis without necessarily
buffering their result sets.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 16:56:57
Message-ID: 2187.1018544217@sss.pgh.pa.us
Lists: pgsql-general

Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org> writes:
> The constructive responses suggested that I use LIMIT/OFFSET and
> CURSORs. I can see how that could be a workaround the problem, but
> I still believe that something is wrong with the PostgreSQL query
> executer. Loading the entire result set into a buffer without
> need just makes no sense.

The Postgres backend does not do that. Most of the frontend client-side
libraries do, but feel free to write one that does not.

Offhand I think the only really serious downside to letting the
application code process the result in a streaming fashion is that the
application would have to be prepared to undo whatever it's done so far,
if it gets an error report partway through the result set. An example:
SELECT 1/x FROM foo;
where foo.x contains zeroes here and there. You'll get some rows out
before the query is abandoned with a divide-by-zero error. By
accumulating the result set, the existing libraries are able to offer a
cleaner yes-it-succeeded or no-it-didn't API. But this is surely not a
fatal objection, just a necessary piece of a less-clean streaming API.

> [snip]
> And of course, the point is that it makes
> sense to design the server such that streaming results to the
> client is transparent because it automatically relieves the strain
> on all resources, CPU, storage and network! Isn't that obvious?

The reason you got brush-off responses before was that you went into
lecture mode when you clearly hadn't spent any effort studying Postgres
internals. You're still doing that...

regards, tom lane


From: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 17:17:21
Message-ID: 3CB5C521.4000100@aurora.regenstrief.org
Lists: pgsql-general

Tom Lane wrote:

> Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org> writes:
>
>>The constructive responses suggested that I use LIMIT/OFFSET and
>>CURSORs. I can see how that could be a workaround the problem, but
>>I still believe that something is wrong with the PostgreSQL query
>>executer. Loading the entire result set into a buffer without
>>need just makes no sense.
>>
>
> The Postgres backend does not do that. Most of the frontend client-side
> libraries do, but feel free to write one that does not.

Thanks, Tom, that's helpful! So, are you saying that the psql
client does all the buffering and not the server? Is the
buffering mandatory when using the libpq API? When you say
"most frontend client-side libraries" do you know of any
exceptions that don't do that? Is the odbc client doing this
too?

I can see myself fixing this problem, but I would like to know
whether something else already exists.

thanks for the info,
-Gunther

--
Gunther Schadow, M.D., Ph.D. gschadow(at)regenstrief(dot)org
Medical Information Scientist Regenstrief Institute for Health Care
Adjunct Assistant Professor Indiana University School of Medicine
tel:1(317)630-7960 http://aurora.regenstrief.org


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Shaun Thomas <sthomas(at)townnews(dot)com>
Cc: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 17:51:16
Message-ID: Pine.GSO.4.44.0204112045140.9764-100000@ra.sai.msu.su
Lists: pgsql-general

The big issue with LIMIT,OFFSET is that it still uses all rows
for sorting. I have already suggested using partial sorting to avoid
sorting all rows if one selects only the first 20 rows, for example.
It's very important for Web applications because web users usually
read only the first 1-2 pages. Our experiments have shown a 6x
performance win when using partial sorting.

Oleg
On Thu, 11 Apr 2002, Shaun Thomas wrote:

> On Wed, 10 Apr 2002, Gunther Schadow wrote:
>
> > SELECT * FROM Bigtable;
> >
> > it takes a long time for it to come up with the first page
> > of results using up lots of computer resources etc and after
> > the first page is returned the backend basically goes into
> > idle mode for all the rest of the query results retrieval.
>
> Very simple. PHP and PERL DBI, when doing non-cursor queries, will try
> and buffer the *entire* result into memory before letting you have it.
> In PHP, when you run a pg_exec, or mysql_query, or whatever, it actually
> executes the query, and stores all of the results in memory. Don't
> believe me? Try closing the database connection, and then do a fetch on
> the result identifier. Heck, fetch until you reach the end.
>
> You'll get all of your results. Even though the database is closed.
>
> So when you are doing a "SELECT * FROM bigtable", you're telling PHP to
> not only buffer over a million rows, but to transfer it from the
> database as well. "SELECT COUNT(*) FROM bigtable" doesn't have
> this overhead. The database just returns a single row, which you
> can view basically as soon as the database is done.
>
> There's also something to be said for the LIMIT clause. Somehow I
> doubt you need every single row in the entire table in order to do
> your work on that page. Just adding LIMIT/OFFSET will increase your
> speed significantly.
>
> Or, as someone else mentioned, use a cursor, and let the database buffer
> your query. It'll be slow too, but you can at least fetch the rows
> individually and get the perception of speed in your application.
>
> > no work needed other than to retrieve the tuples out of
> > physical storage, the response should be immediate and resource
> > usage low. There should not be large buffer allocations.
>
> I see how you can get into this kind of thinking. Since select * has no
> ordering, no calculations, nothing but just returning raw results from
> the table in whatever format they may be in, how could it possibly be
> resource intensive? But the logic is wrong. If you want all the data,
> it'll give it to you. Whether you do it at the console, and it has to
> throw everything at the screen buffer, or your application, which has
> to put it in some temporary storage until it gets what it needs and
> deallocates the memory. That data has to go *somewhere*, it can't just
> exist in limbo until you decide you want to use some of it.
>
> You can use cursors to get around this, because the database basically
> runs the query and doesn't send back squat until you actually ask it to.
> But I'll tell you a well selected limit clause will work almost as well,
> and reduce the need for the database to maintain a cursor.
>
> Say you're on page 5, and you are showing 20 results per page. Knowing
> that results start at 0, you can get the offset just by doing:
>
> (PAGE - 1) * PER_PAGE = (5-1) * 20 = 4*20 = 80. So your query becomes:
>
> SELECT * FROM Bigtable LIMIT 20 OFFSET 80.
>
> And viola. You'll get back 20 rows right where you want them. Also,
> take care of what you select from the table. Maybe you don't actually
> need all of the data, but only certain rows. The less data that has to
> be transferred from database to application, the faster you can get/show
> your data. Especially if your database is on a separate machine from
> your application server. Network transfers are *not* instantaneous.
>
> Try:
>
> SELECT col1,col2,col3,etc FROM Bigtable LIMIT x OFFSET y
>
> instead. You'll save yourself some time, save the database the effort
> of loading all million rows of every column in the table into the
> network interface, and save your application the need to fetch them.
>
> > COUNT(smallcolumn) behaves much faster than COUNT(*).
>
> Or you can do what every database optimization book in the entire
> universe says, and do COUNT(1). Since all you're testing is the
> existence of the row, not any value of anything in it.
>
> > Again, count should be streamed as well such as to use no
> > significant memory resources other than the counter.
>
> Actually, it's converted into: "*, eh? I think not. Let me rewrite
> this query and put a 1 here instead."
>
> The database tries to keep people from shooting themselves in the foot,
> but even the most Herculean effort will fail when the user is bound and
> determined to blow off their big toe.
>
> > Any enlightenments? Am I wrong? Will this be fixed soon?
> > Is it hard to change pgsql to do better streaming of its
> > operations.
>
> Read a few books on database design and optimization, the basics of
> application memory allocation, and the TCP/IP stack. If you want to
> understand how to make an application fast from front to back, you have
> to understand the components that make it work. Your knowledge of
> application memory performance and network latency seems inherently
> flawed, and until you get over the assumption that network transfers are
> free and optimizing queries is a fool's errand, you'll continue to have
> problems in any database you choose.
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Shaun Thomas <sthomas(at)townnews(dot)com>, Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 17:56:16
Message-ID: 200204111756.g3BHuGL13708@candle.pha.pa.us
Lists: pgsql-general

Oleg Bartunov wrote:
> The big issue with LIMIT,OFFSET is that it still use all rows
> for sorting. I already suggested to use partial sorting to avoid
> sorting all rows if one selected only first 20 row, for example.
> It's very important for Web applications because web users usually
> read first 1-2 pages. Our experimnets have shown 6 times performance
> win when using partial sorting.

We do have this TODO item:

* Allow ORDER BY ... LIMIT to select top values without sort or index
using a sequential scan for highest/lowest values
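
Roughly speaking, for a query like

SELECT * FROM chat_post ORDER BY time DESC LIMIT 20;

that would mean scanning the table once while keeping only the current
top 20 rows, rather than sorting the whole table before applying the
LIMIT.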

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 17:56:19
Message-ID: 3CB5CE43.1060304@aurora.regenstrief.org
Lists: pgsql-general

Oleg Bartunov wrote:

> The big issue with LIMIT,OFFSET is that it still use all rows
> for sorting. I already suggested to use partial sorting to avoid
> sorting all rows if one selected only first 20 row, for example.
> It's very important for Web applications because web users usually
> read first 1-2 pages. Our experimnets have shown 6 times performance
> win when using partial sorting.

That is interesting. I haven't even dared to ask for this.
Did you contribute that partial sorting code to postgresql?

thanks,
-Gunther

--
Gunther Schadow, M.D., Ph.D. gschadow(at)regenstrief(dot)org
Medical Information Scientist Regenstrief Institute for Health Care
Adjunct Assistant Professor Indiana University School of Medicine
tel:1(317)630-7960 http://aurora.regenstrief.org


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 18:11:45
Message-ID: 200204111811.g3BIBju15228@candle.pha.pa.us
Lists: pgsql-general

Gunther Schadow wrote:
> Oleg Bartunov wrote:
>
> > The big issue with LIMIT,OFFSET is that it still use all rows
> > for sorting. I already suggested to use partial sorting to avoid
> > sorting all rows if one selected only first 20 row, for example.
> > It's very important for Web applications because web users usually
> > read first 1-2 pages. Our experimnets have shown 6 times performance
> > win when using partial sorting.
>
>
> That is interesting. I haven't even dared to ask for this.
> Did you contribute that partial sorting code to postgresql?

We have the TODO item, but the code is not written yet.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: "David Siebert" <david(at)eclipsecat(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 18:19:46
Message-ID: OJEIJALIHAIBMMBFLCOBMEHNEFAA.david@eclipsecat.com
Lists: pgsql-general

I have had a similar issue when doing a select on a table of 10k or 15k
records. It takes a long time to get the data back. I am using JDBC.
If I take a subset of the table it returns very quickly. I figured that the
JDBC driver is buffering all the data. I guess I will have to try the cursor
or the LIMIT,OFFSET approach next as well.

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org]On Behalf Of Oleg Bartunov
Sent: Thursday, April 11, 2002 1:51 PM
To: Shaun Thomas
Cc: Gunther Schadow; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Critical performance problems on large databases

The big issue with LIMIT,OFFSET is that it still use all rows
for sorting. I already suggested to use partial sorting to avoid
sorting all rows if one selected only first 20 row, for example.
It's very important for Web applications because web users usually
read first 1-2 pages. Our experimnets have shown 6 times performance
win when using partial sorting.

Oleg
On Thu, 11 Apr 2002, Shaun Thomas wrote:

> On Wed, 10 Apr 2002, Gunther Schadow wrote:
>
> > SELECT * FROM Bigtable;
> >
> > it takes a long time for it to come up with the first page
> > of results using up lots of computer resources etc and after
> > the first page is returned the backend basically goes into
> > idle mode for all the rest of the query results retrieval.
>
> Very simple. PHP and PERL DBI, when doing non-cursor queries, will try
> and buffer the *entire* result into memory before letting you have it.
> In PHP, when you run a pg_exec, or mysql_query, or whatever, it actually
> executes the query, and stores all of the results in memory. Don't
> believe me? Try closing the database connection, and then do a fetch on
> the result identifier. Heck, fetch until you reach the end.
>
> You'll get all of your results. Even though the database is closed.
>
> So when you are doing a "SELECT * FROM bigtable", you're telling PHP to
> not only buffer over a million rows, but to transfer it from the
> database as well. "SELECT COUNT(*) FROM bigtable" doesn't have
> this overhead. The database just returns a single row, which you
> can view basically as soon as the database is done.
>
> There's also something to be said for the LIMIT clause. Somehow I
> doubt you need every single row in the entire table in order to do
> your work on that page. Just adding LIMIT/OFFSET will increase your
> speed significantly.
>
> Or, as someone else mentioned, use a cursor, and let the database buffer
> your query. It'll be slow too, but you can at least fetch the rows
> individually and get the perception of speed in your application.
>
> > no work needed other than to retrieve the tuples out of
> > physical storage, the response should be immediate and resource
> > usage low. There should not be large buffer allocations.
>
> I see how you can get into this kind of thinking. Since select * has no
> ordering, no calculations, nothing but just returning raw results from
> the table in whatever format they may be in, how could it possibly be
> resource intensive? But the logic is wrong. If you want all the data,
> it'll give it to you. Whether you do it at the console, and it has to
> throw everything at the screen buffer, or your application, which has
> to put it in some temporary storage until it gets what it needs and
> deallocates the memory. That data has to go *somewhere*, it can't just
> exist in limbo until you decide you want to use some of it.
>
> You can use cursors to get around this, because the database basically
> runs the query and doesn't send back squat until you actually ask it to.
> But I'll tell you a well selected limit clause will work almost as well,
> and reduce the need for the database to maintain a cursor.
>
> Say you're on page 5, and you are showing 20 results per page. Knowing
> that results start at 0, you can get the offset just by doing:
>
> (PAGE - 1) * PER_PAGE = (5-1) * 20 = 4*20 = 80. So your query becomes:
>
> SELECT * FROM Bigtable LIMIT 20 OFFSET 80.
>
> And viola. You'll get back 20 rows right where you want them. Also,
> take care of what you select from the table. Maybe you don't actually
> need all of the data, but only certain rows. The less data that has to
> be transferred from database to application, the faster you can get/show
> your data. Especially if your database is on a separate machine from
> your application server. Network transfers are *not* instantaneous.
>
> Try:
>
> SELECT col1,col2,col3,etc FROM Bigtable LIMIT x OFFSET y
>
> instead. You'll save yourself some time, save the database the effort
> of loading all million rows of every column in the table into the
> network interface, and save your application the need to fetch them.
>
> > COUNT(smallcolumn) behaves much faster than COUNT(*).
>
> Or you can do what every database optimization book in the entire
> universe says, and do COUNT(1). Since all you're testing is the
> existence of the row, not any value of anything in it.
>
> > Again, count should be streamed as well such as to use no
> > significant memory resources other than the counter.
>
> Actually, it's converted into: "*, eh? I think not. Let me rewrite
> this query and put a 1 here instead."
>
> The database tries to keep people from shooting themselves in the foot,
> but even the most Herculean effort will fail when the user is bound and
> determined to blow off their big toe.
>
> > Any enlightenments? Am I wrong? Will this be fixed soon?
> > Is it hard to change pgsql to do better streaming of its
> > operations.
>
> Read a few books on database design and optimization, the basics of
> application memory allocation, and the TCP/IP stack. If you want to
> understand how to make an application fast from front to back, you have
> to understand the components that make it work. Your knowledge of
> application memory performance and network latency seems inherently
> flawed, and until you get over the assumption that network transfers are
> free and optimizing queries is a fool's errand, you'll continue to have
> problems in any database you choose.
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org, Teodor Sigaev <teodor(at)stack(dot)net>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 18:28:48
Message-ID: Pine.GSO.4.44.0204112124530.2819-100000@ra.sai.msu.su
Lists: pgsql-general

On Thu, 11 Apr 2002, Gunther Schadow wrote:

> Oleg Bartunov wrote:
>
> > The big issue with LIMIT,OFFSET is that it still use all rows
> > for sorting. I already suggested to use partial sorting to avoid
> > sorting all rows if one selected only first 20 row, for example.
> > It's very important for Web applications because web users usually
> > read first 1-2 pages. Our experimnets have shown 6 times performance
> > win when using partial sorting.
>
>
> That is interesting. I haven't even dared to ask for this.
> Did you contribute that partial sorting code to postgresql?

A year ago we had a very crude patch which implemented partial sorting.
We had hoped to work more on it for 7.2/7.3, but it seems we have no time.
If you're ready to work on this issue we could think about
contributing our libpsort to postgres.

>
> thanks,
> -Gunther
>
>
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Shaun Thomas <sthomas(at)townnews(dot)com>, Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 18:31:07
Message-ID: Pine.GSO.4.44.0204112129490.2819-100000@ra.sai.msu.su
Lists: pgsql-general

On Thu, 11 Apr 2002, Bruce Momjian wrote:

> Oleg Bartunov wrote:
> > The big issue with LIMIT,OFFSET is that it still use all rows
> > for sorting. I already suggested to use partial sorting to avoid
> > sorting all rows if one selected only first 20 row, for example.
> > It's very important for Web applications because web users usually
> > read first 1-2 pages. Our experimnets have shown 6 times performance
> > win when using partial sorting.
>
> We do have this TODO item:
>
> * Allow ORDER BY ... LIMIT to select top values without sort or index
> using a sequential scan for highest/lowest values
>

looks too complex to me :-) I like partial sorting, but it's not
important.

>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 18:35:06
Message-ID: Pine.GSO.4.44.0204112131530.2819-100000@ra.sai.msu.su
Lists: pgsql-general

On Thu, 11 Apr 2002, Bruce Momjian wrote:

> Gunther Schadow wrote:
> > Oleg Bartunov wrote:
> >
> > > The big issue with LIMIT,OFFSET is that it still use all rows
> > > for sorting. I already suggested to use partial sorting to avoid
> > > sorting all rows if one selected only first 20 row, for example.
> > > It's very important for Web applications because web users usually
> > > read first 1-2 pages. Our experimnets have shown 6 times performance
> > > win when using partial sorting.
> >
> >
> > That is interesting. I haven't even dared to ask for this.
> > Did you contribute that partial sorting code to postgresql?
>
>
> We have the TODO item, but the code is not written yet.
>

Actually, we have libpsort for partial sorting and a very crude patch for 7.1.
Jan and Tom were not optimistic about that patch, so we need one bright
person who could incorporate the idea of partial sorting into the current
sources and satisfy the core developers. We have no time, unfortunately.

>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-11 20:05:33
Message-ID: 3CB5EC8D.2060803@aurora.regenstrief.org
Lists: pgsql-general

Oleg Bartunov wrote:

> On Thu, 11 Apr 2002, Bruce Momjian wrote:
>
>
>>Oleg Bartunov wrote:
>>
>>>The big issue with LIMIT/OFFSET is that it still uses all rows
>>>for sorting. I already suggested using partial sorting to avoid
>>>sorting all rows if one selects only the first 20 rows, for example.
>>>It's very important for Web applications because web users usually
>>>read the first 1-2 pages. Our experiments have shown a 6-times
>>>performance win when using partial sorting.
>>>
>>We do have this TODO item:
>>
>> * Allow ORDER BY ... LIMIT to select top values without sort or index
>> using a sequential scan for highest/lowest values
>>
>>
>
> looks too complex to me :-) I like partial sorting, but it's not
> important.

Oleg, I might agree. I might even take on some of this myself. But I
think a first step would be for you to put your libpsort and
the patch "out there" so that someone could just look, try it,
and see whether taking the time to do what needs to be done is feasible
with this. If you can put up your lib, patch, and documentation
(even crude notes) on a web site, that would be a good start.

regards
-Gunther

--
Gunther Schadow, M.D., Ph.D. gschadow(at)regenstrief(dot)org
Medical Information Scientist Regenstrief Institute for Health Care
Adjunct Assistant Professor Indiana University School of Medicine
tel:1(317)630-7960 http://aurora.regenstrief.org


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-12 05:41:23
Message-ID: Pine.GSO.4.44.0204120836520.2819-100000@ra.sai.msu.su
Lists: pgsql-general

Well,

I've got several responses. I'm far from the office and have a slow and
unstable connection, so I will return to this topic next week.

Oleg
On Thu, 11 Apr 2002, Gunther Schadow wrote:

> Oleg Bartunov wrote:
>
> > On Thu, 11 Apr 2002, Bruce Momjian wrote:
> >
> >
> >>Oleg Bartunov wrote:
> >>
> >>>The big issue with LIMIT/OFFSET is that it still uses all rows
> >>>for sorting. I already suggested using partial sorting to avoid
> >>>sorting all rows if one selects only the first 20 rows, for example.
> >>>It's very important for Web applications because web users usually
> >>>read the first 1-2 pages. Our experiments have shown a 6-times
> >>>performance win when using partial sorting.
> >>>
> >>We do have this TODO item:
> >>
> >> * Allow ORDER BY ... LIMIT to select top values without sort or index
> >> using a sequential scan for highest/lowest values
> >>
> >>
> >
> > looks too complex to me :-) I like partial sorting, but it's not
> > important.
>
>
> Oleg, I might agree. I might even take on some of this myself. But I
> think a first step would be for you to put your libpsort and
> the patch "out there" so that someone could just look, try it,
> and see whether taking the time to do what needs to be done is feasible
> with this. If you can put up your lib, patch, and documentation
> (even crude notes) on a web site, that would be a good start.
>
> regards
> -Gunther
>
>
>
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-12 14:49:17
Message-ID: 5.1.0.14.1.20020412223513.0265f020@192.228.128.13
Lists: pgsql-general

At 11:09 AM 4/11/02 -0500, Gunther Schadow wrote:
>There was one remark about Perl or PHP always loading the complete
>result set before returning. Bad for them. I don't use either and
>I think it's just bad design to do that on the client but I don't
>care about bad clients. I care about a good server.

>The constructive responses suggested that I use LIMIT/OFFSET and
>CURSORs. I can see how that could be a workaround for the problem, but
>I still believe that something is wrong with the PostgreSQL query
>executor. Loading the entire result set into a buffer without
>need just makes no sense. Good database engines try to provide

AFAIK you can turn off buffering of the whole result set with Perl DBI/DBD.

Buffering allows programmers to easily use each row of a query to make
other queries, even if the DB doesn't support cursors.

e.g.

    select ID from table1;
    for each row returned {
        select ID2 from table2 where x = ID;
        for each row returned {
            ....
        }
    }

Without buffering or cursors you will have to open another connection or
store all rows somewhere then do the subsequent queries using the stored rows.
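
For what it's worth, the cursor approach suggested in this thread needs no special support from the client library; any libpq program can fetch a big result a page at a time. A rough sketch, assuming a table called bigtable and a 100-row page:

/*
 * Sketch: stream a large result from libpq with a cursor so that only one
 * batch of rows is ever held in client memory.  "bigtable" and the batch
 * size are made up; error handling is abbreviated.
 */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    PQclear(PQexec(conn, "BEGIN"));   /* cursors live inside a transaction */
    PQclear(PQexec(conn, "DECLARE c CURSOR FOR SELECT * FROM bigtable"));

    for (;;) {
        PGresult *res = PQexec(conn, "FETCH 100 FROM c");  /* one page */
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            fprintf(stderr, "FETCH failed: %s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        int n = PQntuples(res);
        for (int i = 0; i < n; i++)
            printf("%s\n", PQgetvalue(res, i, 0));  /* first column only */
        PQclear(res);
        if (n == 0)       /* an empty FETCH means the cursor is exhausted */
            break;
    }

    PQclear(PQexec(conn, "CLOSE c"));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}

Only one batch is ever held on the client, and the first FETCH returns as soon as the first rows are produced (for plans that can stream), which is the quick first page the original post asks about.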

Cheerio,
Link.


From: Jean-Luc Lachance <jllachan(at)nsd(dot)ca>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Shaun Thomas <sthomas(at)townnews(dot)com>, Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-12 15:52:36
Message-ID: 3CB702C4.64C1EFE5@nsd.ca
Lists: pgsql-general

Bruce Momjian wrote:
>
> Oleg Bartunov wrote:
> > The big issue with LIMIT/OFFSET is that it still uses all rows
> > for sorting. I already suggested using partial sorting to avoid
> > sorting all rows if one selects only the first 20 rows, for example.
> > It's very important for Web applications because web users usually
> > read the first 1-2 pages. Our experiments have shown a 6-times
> > performance win when using partial sorting.
>
> We do have this TODO item:
>
> * Allow ORDER BY ... LIMIT to select top values without sort or index
> using a sequential scan for highest/lowest values

It may not be wise to disregard the index if there is one for the ORDER
BY.

JLL

>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Jean-Luc Lachance <jllachan(at)nsd(dot)ca>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Shaun Thomas <sthomas(at)townnews(dot)com>, Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-12 16:19:23
Message-ID: 200204121619.g3CGJNx10403@candle.pha.pa.us
Lists: pgsql-general

Jean-Luc Lachance wrote:
> Bruce Momjian wrote:
> >
> > Oleg Bartunov wrote:
> > > The big issue with LIMIT/OFFSET is that it still uses all rows
> > > for sorting. I already suggested using partial sorting to avoid
> > > sorting all rows if one selects only the first 20 rows, for example.
> > > It's very important for Web applications because web users usually
> > > read the first 1-2 pages. Our experiments have shown a 6-times
> > > performance win when using partial sorting.
> >
> > We do have this TODO item:
> >
> > * Allow ORDER BY ... LIMIT to select top values without sort or index
> > using a sequential scan for highest/lowest values
>
> It may not be wise to disregard the index if there is one for the ORDER
> BY.

Yes, this would be for cases where the index isn't useful, either
because there isn't one or because the join order makes it useless.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-12 20:47:30
Message-ID: 10433.1018644450@sss.pgh.pa.us
Lists: pgsql-general

> Thanks, Tom, that's helpful! So, are you saying that the psql
> client does all the buffering and not the server? Is the
> buffering mandatory when using the libpq API? When you say
> "most frontend client side libraries" do you know of any
> exceptions who don't do that? Is the odbc client doing this
> too?

Yup, yup, and all the ones I know about, though I do not know the details
of several (including jdbc and odbc).

> I can see myself fixing this problem, but would like to know
> if there is something else already?

Not that I know of.

I could imagine extending libpq reasonably straightforwardly:
assuming that you use PQsendQuery/PQgetResult, you could add a
PQgetPartialResult() that would extract the rows-received-so-far.
When it fails to get any more rows, a final PQgetResult would get the
query's success/fail status and possible error message. Not sure about
all the details here; if you want to pursue it, start a discussion on
pgsql-interfaces.
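
To make that concrete, calling code for such an extension might look roughly like the sketch below. PQsendQuery() and PQgetResult() exist today; PQgetPartialResult() is only the hypothetical function described above:

/*
 * How calling code might look if the proposed function existed.
 * PQsendQuery() and PQgetResult() are real libpq calls;
 * PQgetPartialResult() is the hypothetical addition described above and
 * does not exist in libpq today.
 */
#include <stdio.h>
#include <libpq-fe.h>

void stream_query(PGconn *conn)
{
    PQsendQuery(conn, "SELECT * FROM bigtable");     /* returns at once */

    PGresult *part;
    while ((part = PQgetPartialResult(conn)) != NULL) {  /* hypothetical */
        for (int i = 0; i < PQntuples(part); i++)
            printf("%s\n", PQgetvalue(part, i, 0));  /* rows so far */
        PQclear(part);
    }

    PGresult *last = PQgetResult(conn);     /* final success/fail status */
    if (PQresultStatus(last) != PGRES_TUPLES_OK)
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
    PQclear(last);
    while ((last = PQgetResult(conn)) != NULL)  /* drain until NULL */
        PQclear(last);
}

How the partial results should be packaged (one PGresult per batch, or something else) is exactly the sort of detail that would belong on pgsql-interfaces.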

regards, tom lane


From: will trillich <will(at)serensoft(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-13 15:23:19
Message-ID: 20020413102319.A17204@serensoft.com
Lists: pgsql-general

On Thu, Apr 11, 2002 at 11:09:05AM -0500, Gunther Schadow wrote:
[snip]
> The constructive responses suggested that I use LIMIT/OFFSET and
> CURSORs. I can see how that could be a workaround for the problem, but
> I still believe that something is wrong with the PostgreSQL query
> executor.

<editorialize emphasis="mine" text="yours">

> LOADING THE ENTIRE RESULT SET INTO A BUFFER WITHOUT
> NEED JUST MAKES NO SENSE.

</editorialize>

when YOU say "SELECT * FROM SOMETABLE" that's what YOU are doing.
the software tries to accommodate your request as best it can.

> Good data base engines try to provide
> for parallel execution of the query plan as much as possible, and
> that implies streaming. There's a crowd of literature about this
> testifying for it's importance.

if you need the first twenty or the last fifty rows for your
purpose, then use limit & offset, or try the cursor approach. if
you NEED all the rows (a very unusual occurrence outside of
mirroring instances -- even periodic summation, for report
generation, will use aggregation instead of all individual rows)
then ask for them.
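
As a concrete version of the LIMIT & OFFSET suggestion, a client could page through a big table roughly like this (table, column, and page size are made up):

/*
 * Sketch of the LIMIT/OFFSET approach: fetch one 100-row page at a time.
 * Table, sort column, and page size are made up.  Note that the backend
 * still has to sort the whole result and skip the OFFSET rows, which is
 * why deep pages stay expensive (the point of the partial-sorting
 * discussion earlier in the thread).
 */
#include <stdio.h>
#include <libpq-fe.h>

static PGresult *fetch_page(PGconn *conn, int page)
{
    char sql[256];
    snprintf(sql, sizeof(sql),
             "SELECT * FROM bigtable ORDER BY id LIMIT 100 OFFSET %d",
             page * 100);
    return PQexec(conn, sql);   /* caller checks status and calls PQclear */
}

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");
    PGresult *res = fetch_page(conn, 0);            /* first page */
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("%d rows on this page\n", PQntuples(res));
    PQclear(res);
    PQfinish(conn);
    return 0;
}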

--
I use Debian/GNU Linux version 2.2;
Linux server 2.2.17 #1 Sun Jun 25 09:24:41 EST 2000 i586 unknown

DEBIAN NEWBIE TIP #63 from Will Trillich <will(at)serensoft(dot)com>
:
What's the best way to GET RESPONSES ON DEBIAN-USER? There are
several things to keep in mind:
1) Debians are all volunteers because they enjoy what they
do; they don't owe you diddly (and you'll be one of us
when you start getting involved): ASK, and ye shall
receive; DEMAND, and ye shall be rebuffed
2) Provide evidence showing that you did put effort into
finding a solution to your problem (at least demonstrate
that you've seen the manual)
3) Be known to offer pointers and assistance to others
4) Give enough information so that someone else can figure
out what you're after; and make it legible
5) Enjoy yourself and have fun -- it'll come across, and we
enjoy people who enjoy life; a petulant whiner seldom
gets any useful pointers other than "Out, damn spot!"

Also see http://newbieDoc.sourceForge.net/ ...


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Critical performance problems on large databases
Date: 2002-04-18 04:04:04
Message-ID: 200204180404.g3I444d06243@candle.pha.pa.us
Lists: pgsql-general


Oleg, I have added your name to this TODO list:

* Allow ORDER BY ... LIMIT to select top values without sort or index
using a sequential scan for highest/lowest values (Oleg)

---------------------------------------------------------------------------

Oleg Bartunov wrote:
> On Thu, 11 Apr 2002, Bruce Momjian wrote:
>
> > Gunther Schadow wrote:
> > > Oleg Bartunov wrote:
> > >
> > > > The big issue with LIMIT/OFFSET is that it still uses all rows
> > > > for sorting. I already suggested using partial sorting to avoid
> > > > sorting all rows if one selects only the first 20 rows, for example.
> > > > It's very important for Web applications because web users usually
> > > > read the first 1-2 pages. Our experiments have shown a 6-times
> > > > performance win when using partial sorting.
> > >
> > >
> > > That is interesting. I haven't even dared to ask for this.
> > > Did you contribute that partial sorting code to postgresql?
> >
> >
> > We have the TODO item, but the code is not written yet.
> >
>
> Actually, we have libpsort for partial sorting and a very crude patch for 7.1.
> Jan and Tom were not optimistic about that patch, so we need one bright
> person who could incorporate the idea of partial sorting into the current
> sources and satisfy the core developers. We have no time, unfortunately.
>
>
> >
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Francisco Reyes <lists(at)natserv(dot)com>
To: Bill Gribble <grib(at)linuxdevel(dot)com>
Cc: Gunther Schadow <gunther(at)aurora(dot)regenstrief(dot)org>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Critical performance problems on large databases
Date: 2002-04-19 16:15:12
Message-ID: 20020419121330.L76398-100000@zoraida.natserv.net
Lists: pgsql-general

On 11 Apr 2002, Bill Gribble wrote:

> On Wed, 2002-04-10 at 17:39, Gunther Schadow wrote:
>
> Then the biggest slowdown is count(*), which we have to do in order to
> fake up the scrollbar (so we know what proportion of the data has been
> scrolled through). I have not completely fixed this yet. I want to
> keep a separate mini-table of how many records are in the big table and
> update it with a trigger (the table is mostly static). ATM, I just try
> hard to minimize the times I call count(*).

Don't recall right now which table, but there is a table that has record
counts. It is accurate if you have done a vacuum full. Vacuum analyze only
gives you estimates, but that may be good enough for your needs.
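
The catalog he can't recall is presumably pg_class, whose reltuples column holds a per-table row count maintained by VACUUM and ANALYZE. A small sketch of reading it from libpq, assuming a table named bigtable:

/*
 * Sketch: read the row-count estimate from pg_class.reltuples instead of
 * running COUNT(*) over the whole table.  The value is only as fresh as
 * the last VACUUM/ANALYZE of that table.  "bigtable" is a made-up name.
 */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    PGresult *res = PQexec(conn,
        "SELECT reltuples FROM pg_class WHERE relname = 'bigtable'");
    if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
        printf("estimated rows: %s\n", PQgetvalue(res, 0, 0));

    PQclear(res);
    PQfinish(conn);
    return 0;
}

For an exact count that stays cheap, the trigger-maintained mini-table described in the quoted message above is the usual alternative.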