Skip site navigation (1) Skip section navigation (2)

Peripheral Links

Header And Logo

PostgreSQL
| The world's most advanced open source database.

Site Navigation

Search for
  Advanced Search

Re: TOAST compression


  • From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
  • To: "Jim C. Nasby" <jnasby(at)pervasive(dot)com>, "Neil Conway" <neilc(at)samurai(dot)com>
  • Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
  • Subject: Re: TOAST compression
  • Date: Sun, 26 Feb 2006 09:31:05 -0800
  • Message-id: <C02725D9(dot)1DD73%llonergan(at)greenplum(dot)com>

Jim,

On 2/26/06 8:00 AM, "Jim C. Nasby" <jnasby(at)pervasive(dot)com> wrote:

> Any idea on how decompression time compares to IO bandwidth? In other
> words, how long does it take to decompress 1MB vs read that 1MB vs read
> whatever the uncompressed size is?

On DBT-3 data, I've just run some tests meant to simulate the speed
differences of compression versus native I/O.  My thought is that an
external use of gzip on a binary dump file should be close to the speed of
LZW on toasted fields, so I just dumped the "supplier" table (see below) of
size 202MB in data pages to disk, then ran gzip/gunzip on the the binary
file.  Second test - an 8k block dd from that same file, meant to simulate a
seq scan (it's faster by 25% than doing it in PG though):

==================== gzip/gunzip =====================
[mppdemo1(at)salerno0]$ ls -l supplier.bin
-rw-r--r--  1 mppdemo1 mppdemo1 177494266 Feb 26 09:17 supplier.bin

[mppdemo1(at)salerno0]$ time gzip supplier.bin

real    0m12.979s
user    0m12.558s
sys     0m0.400s
[mppdemo1(at)salerno0]$ time gunzip supplier.bin

real    0m2.286s
user    0m1.713s
sys     0m0.573s
[mppdemo1(at)salerno0]$ time dd if=supplier.bin of=/dev/null bs=8k
21666+1 records in
21666+1 records out

real    0m0.138s
user    0m0.003s
sys     0m0.135s

==================== Interpretation =====================
Zip speed: 177.5 MB in 13 seconds, or 13.6 MB/s
Unzip speed: 177.5 MB in 2.29 seconds, or 77.5 MB/s
Direct access speed: 177.5 MB in 0.138 seconds or 1,286 MB/s

Note that this filesystem can do about 400MB/s, and we routinely see scan
rates of 300MB/s within PG, so the real comparision is:

Direct seqscan at 300MB/s versus gunzip at 77.5MB/s

==================== Background data =====================
demo=# \d supplier
             Table "public.supplier"
   Column    |          Type          | Modifiers
-------------+------------------------+-----------
 s_suppkey   | integer                | not null
 s_name      | character(25)          | not null
 s_address   | character varying(40)  | not null
 s_nationkey | integer                | not null
 s_phone     | character(15)          | not null
 s_acctbal   | numeric(15,2)          | not null
 s_comment   | character varying(101) | not null

demo=# select relname,8*relpages/128 as MB from pg_class order by relpages
desc limit 6;
 relname  |   mb   
----------+--------
 lineitem | 123434
 orders   |  24907
 partsupp |  14785
 part     |   3997
 customer |   3293
 supplier |    202
(6 rows)

Time: 2.024 ms
demo=# copy supplier to '/tmp/supplier.bin' with binary;
COPY
Time: 7328.186 ms
demo=# copy supplier to '/tmp/supplier.txt';
COPY
Time: 5503.168 ms

******** Note how the text file dumps faster than binary, and it's smaller
******** at 148MB.

demo=# select version();
                   
version            
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-----
 PostgreSQL 8.1.3 (Bizgres MPP 2.1) on x86_64-unknown-linux-gnu, compiled by
GCC gcc (GCC) 3.4.4 20050721 (Red Hat 3.4.4-2) compiled on Feb 23 2006
11:34:06
(1 row)






Home | Main Index | Thread Index

Privacy Policy | PostgreSQL Archives hosted by Command Prompt, Inc. | Designed by tinysofa
Copyright © 1996 – 2008 PostgreSQL Global Development Group