Skip site navigation (1) Skip section navigation (2)

Peripheral Links

Header And Logo

PostgreSQL
| The world's most advanced open source database.

Site Navigation

Search for
  Advanced Search

Re: [PERFORM] A Better External Sort?


  • From: Pailloncy Jean-Gerard <jg(at)rilk(dot)com>
  • Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, pgsql-performance(at)postgresql(dot)org
  • Subject: Re: [PERFORM] A Better External Sort?
  • Date: Thu, 29 Sep 2005 13:11:45 +0200
  • Message-id: <5A8CC216-2E25-4B8C-910A-69224D0A66CE(at)rilk(dot)com>

Your main example seems to focus on a large table where a key column has
constrained values.  This case is interesting in proportion to the
number of possible values. If I have billions of rows, each having one
of only two values, I can think of a trivial and very fast method of
returning the table "sorted" by that key: make two sequential passes,
returning the first value on the first pass and the second value on the
second pass.  This will be faster than the method you propose.


1= No that was not my main example. It was the simplest example used to frame the later more complicated examples. Please don't get hung up on it.

2= You are incorrect. Since IO is the most expensive operation we can do, any method that makes two passes through the data at top scanning speed will take at least 2x as long as any method that only takes one such pass.
You do not get the point.
As the time you get the sorted references to the tuples, you need to fetch the tuples themself, check their visbility, etc. and returns them to the client.

So,
if there is only 2 values in the column of big table that is larger than available RAM,
two seq scans of the table without any sorting
is the fastest solution.

Cordialement,
Jean-Gérard Pailloncy




Home | Main Index | Thread Index

Privacy Policy | PostgreSQL Archives hosted by Command Prompt, Inc. | Designed by tinysofa
Copyright © 1996 – 2008 PostgreSQL Global Development Group