Postgres 7.3.1 poor insert/update/search performance esp WRT Oracle

From: "Seth Robertson" <pgsql-performance(at)sysd(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Postgres 7.3.1 poor insert/update/search performance esp WRT Oracle
Date: 2003-01-21 21:20:49
Message-ID: 200301212120.h0LLKn820463@winwood.sysdetect.com
Lists: pgsql-performance

I'm trying to get converted over to Postgres from Oracle (Postgres is
billions of times more straightforward and pragmatically clean than
Oracle), but I'm having some severe performance problems on what
naively appears to be a very straightforward dead-simple test.

The test is comprised of two parts: a write part which attempts to
accumulate (sum) numbers by distinct keys, and a read part which
searches for keys in the database (some of which will be present, some
of which will not). In a more realistic scenario, both will be
happening all of the time, but we can start off easy.

However, performance is terrible: around 39 write transactions/second
and 69 searches/second. Oracle, by comparison, writes at 314 and
reads at 395--practically an order of magnitude better performance.
Both are using the same hardware (obviously not at the same time):
a dual-processor AMD 2000+ with 3GB of memory, with both Oracle and
Postgres loaded on a 105GB ``MD'' stripe (no redundancy) across 2
SCSI disks, running an ext3 filesystem (no special flags) under
Linux 2.4.18-10smp.

I actually have seven different schemes for performing the writes
using Postgres:

----------------------------------------------------------------------
"normal" C libpq                       39 t/s
"normal" Perl DBI                      39 t/s
"DBI Prepared Statement" Perl DBI      39 t/s
"Batching" Perl DBI                    45 t/s
"arrays" Perl DBI                      26 t/s
"server-side function" Perl DBI        39 t/s
"server-side trigger" Perl DBI         39 t/s
"normal" Perl DBI read                 69 t/s
"normal" Perl DBI for Oracle          314 t/s
"normal" Perl DBI read for Oracle     395 t/s
----------------------------------------------------------------------

Only batching showed a statistically significant improvement, and it
wasn't that major. I couldn't use true Postgres prepared statements,
since you cannot yet determine the success/failure of the statements.
I was planning on using arrays as well, but the additional 33%
performance impact is not amusing (though I suppose it is only an
additional 3% if you consider the 87% performance drop going from
Oracle to Postgres).

I'll include all methods in the attached file, but since there was no
significant difference, I'll concentrate on the basic one:

Example table:
----------------------------------------------------------------------
CREATE TABLE test (
    val   BIGINT PRIMARY KEY,  -- "vals" may be between 0 and 2^32-1
    accum INTEGER
);
----------------------------------------------------------------------
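One thing worth keeping in mind with a BIGINT key on Postgres of this
vintage: an unquoted integer literal is typed as a plain (4-byte)
integer, and the planner may not match it against the int8 index, so
the usual workaround is to quote or cast the literal. A hedged sketch
against the table above (whether your version needs this is something
EXPLAIN will tell you):

```sql
-- Unquoted: the literal is int4; an older planner may skip the index
SELECT accum FROM test WHERE val = 5;

-- Quoted or cast: the literal resolves to int8, so the index qualifies
SELECT accum FROM test WHERE val = '5';
SELECT accum FROM test WHERE val = 5::bigint;
```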

Basic algorithm for writes
----------------------------------------------------------------------
while (<>)
{
    chomp;
    @A = split;    # $A[0] = key, $A[1] = amount to accumulate

    # dosql() returns the DBI row count; "0E0" means zero rows updated,
    # so the key does not exist yet and we insert it instead
    if (dosql($dbh, "UPDATE test SET accum = accum + $A[1] WHERE val = '$A[0]';", 0) eq "0E0")
    {
        dosql($dbh, "INSERT INTO test VALUES ( $A[0], $A[1] );");
    }
}
----------------------------------------------------------------------
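For reference, the ``batching'' variant boils down to wrapping many of
these update/insert pairs in one transaction, so they share a single
commit (and thus a single fsync) instead of paying one per statement.
A sketch in plain SQL (not the exact code in the tarball; the literal
key/amount values are illustrative):

```sql
BEGIN;
UPDATE test SET accum = accum + 53 WHERE val = '5';
-- if the UPDATE reported 0 rows, fall back to an INSERT for that key:
INSERT INTO test VALUES (5, 53);
-- ... many more key/amount pairs ...
COMMIT;
```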

Basic algorithm for reads
----------------------------------------------------------------------
while (<>)
{
    chomp;
    @A = split;    # $A[0] = key to look up
    $sth = querysql($dbh, "SELECT accum FROM test WHERE val = $A[0];");
    $hit++ if ($sth && ($row = $sth->fetchrow_arrayref));
    $tot++;
}
----------------------------------------------------------------------
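7.3 does have SQL-level PREPARE/EXECUTE, which at least avoids
re-planning the same SELECT 200K times in the read loop. A sketch
(the statement name `lookup' is mine; declaring the parameter as
bigint also sidesteps the literal-typing issue):

```sql
PREPARE lookup (bigint) AS
    SELECT accum FROM test WHERE val = $1;

EXECUTE lookup(5);
```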

What could be simpler?

In my randomly generated write data, I usually have about 18K inserts
and 82K updates. In my randomly generated read data, I have 100K keys
which will be found and 100K keys which will not be found.

The postgresql.conf file is default (my sysadmin nuked all of my
changes when he upgraded to 7.3.1--grr), and there are some kernel
shared memory settings:

    kernel.sem    = 250 32000 100 128
    kernel.shmmax = 2147483648
    kernel.shmmni = 100
    kernel.shmmax = 134217728

The WAL is not separated (but see below).
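Since the config is stock: for a 3GB box, the non-default settings
people usually suggest look something like the following (parameter
names are the 7.3 ones; the values are illustrative guesses, not
tested numbers):

```ini
shared_buffers = 8192          # 64MB at 8KB/page; the default is far smaller
sort_mem = 8192                # per-sort memory in KB (later renamed work_mem)
effective_cache_size = 200000  # ~1.6GB of OS file cache, in 8KB pages
wal_buffers = 16
```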

A "vacuum analyze" is performed between the write phase and the read
phase. However, for your analysis pleasure, here are the results
of a full verbose analyze and some explain results (both before and after).

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
seth=> explain update test set accum = accum + 53 where val = '5';
QUERY PLAN
-----------------------------------------------------
Seq Scan on test (cost=0.00..0.00 rows=1 width=18)
Filter: (val = 5::bigint)
(2 rows)
seth=> explain insert into test values (5, 53);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
(1 row)
seth=> vacuum full verbose analyze test;
INFO: --Relation public.test--
INFO: Pages 541: Changed 2, reaped 540, Empty 0, New 0; Tup 18153: Vac 81847, Keep/VTL 0/0, UnUsed 0, MinLen 40, MaxLen 40; Re-using: Free/Avail. Space 3294932/3294932; EndEmpty/Avail. Pages 0/541.
CPU 0.00s/0.03u sec elapsed 0.02 sec.
INFO: Index test_pkey: Pages 355; Tuples 18153: Deleted 81847.
CPU 0.03s/0.34u sec elapsed 0.65 sec.
INFO: Rel test: Pages: 541 --> 99; Tuple(s) moved: 18123.
CPU 1.01s/0.31u sec elapsed 9.65 sec.
INFO: Index test_pkey: Pages 355; Tuples 18153: Deleted 18123.
CPU 0.02s/0.06u sec elapsed 0.19 sec.
INFO: Analyzing public.test
VACUUM

seth=> explain select accum from test where val = 5;
QUERY PLAN
----------------------------------------------------------------------
Index Scan using test_pkey on test (cost=0.00..5.99 rows=1 width=4)
Index Cond: (val = 5)
(2 rows)
seth=> explain update test set accum = accum + 53 where val = '5';
QUERY PLAN
-----------------------------------------------------------------------
Index Scan using test_pkey on test (cost=0.00..5.99 rows=1 width=18)
Index Cond: (val = 5::bigint)
(2 rows)
seth=> explain insert into test values (5, 53);
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
(1 row)
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\

I certainly understand that using an index scan might well speed
things up WRT the update policy, but considering that the search
performance above is post-analyze (pre-analyze it is even more deadly
slow), I am dubious that doing it during the updates will get me
within striking distance of Oracle, since read performance has got to
be better than write performance, right? This is also why I am
dubious that moving the WAL to another filesystem or futzing with the
fsync policy will do anything.

I will include below a compressed tarball of the programs I used (and
the corresponding RUNME script) in case you wish to play along at
home. I don't claim they are pretty, BTW :-)

-Seth Robertson

Attachment Content-Type Size
perftest.tar.gz application/tar 4.9 KB
