prelimiary performance comparison pgsql vs mysql

From: "Rick Schumeyer" <rschumeyer(at)ieee(dot)org>
To: "'PgSql General'" <pgsql-general(at)postgresql(dot)org>
Subject: prelimiary performance comparison pgsql vs mysql
Date: 2005-03-14 11:52:58
Message-ID: 010001c5288c$5e3b3c40$0200a8c0@dell8200
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Below are some PRELIMINARY results in comparing the performance of pgsql and
mysql.

These results are for a single process populating a table with 934k rows,
and then performing some selects. I also compared the effect of creating
indexes on some of the columns.

I have not yet done any testing of transactions, multiple concurrent
processes, etc.

I did not make any changes to the default config settings. I can do
so if someone has some suggestions.

My machine is a 3.0 GHz P4 with 1 GB ram, running FC 3.

I used pg 8.0.1 and mysql 5.0.2 alpha.

I compiled pg from source, but I downloaded an binary for mysql. If
someone thinks this is significant, and can point me to a good
binary for pg, I will give it a try.

All timings are as reported by the db.

I included the pg script below..

Finally, I don't have an axe to grind either way. I'm trying to be
fair, but I am the first to admit I am not an expert in db tuning.
I welcome constructive comments and advice.

**************** data and queries

The data comes from some experimental data I have been working with.

I load data into a table with relavant columns fid, rid, x.
The combination of fid,rid is unique.
x is a location, and is not unique.

I loaded the data using COPY (pg) and LOAD (mysql).

The queries were:

select count(*) from data where fid=2 and rid=6; count = 100
select count(*) from data where x > 5000 and x < 5500; count = 35986
select count(*) from data where x > 5000 and x < 5020; count = 1525

***************** preliminary conclusions

As suspected, MyISAM is very fast. In the tested case (only
one process, only one big load and some selects) MyISAM tables are
much faster than pg or InnoDB.

For queries, InnoDB and pg are roughly equivalent. In some cases
one or the other is a little faster, but they are mostly in the
same ballpark. The one exception seems to be that pg has an edge
in seq scans.

pg is slower loading data when it has to create an index. Also,
I found that is is critical to run "vacuum analyze" in pg. Running
"analyze" in mysql did not seem to make much difference. I'm guessing
that mysql builds statistics while it is loading data, and does not
actually run an analyze since the table has not changed.

******************* preliminary results ***************************
******************* all times in seconds **************************

note: input table has 934500 rows.

mysql 5.0.2 alpha
PG 8.0.1 MyISAM InnoDB
NO INDEXES

Load file 22.3 3.9 22.1
select count fid=?,rid=? 3.0 0.23 2.07
select count x > 5000, x < 5500 1.2 0.27 1.59
select count x > 5000, x < 5020 0.63 0.29 1.58

INDEXES on (fid,rid)

Load file 36. 13.5 30.1
vacuum analyze 3.6
select count fid=?,rid=? 0.0 0.00 0.02
select count x > 5000, x < 5500 0.702 0.29 2.07
select count x > 5000, x < 5020 0.713 0.28 1.59

INDEXES on (fid,rid) and (x)

Load file 202. 24. 151.
vacuum analyze 11.
select count fid=?,rid=? 0.002 0.00 0.02

select count x > 5000, x < 5500 0.9 0.06 0.75
select count x > 5000, x < 5020 0.048 0.01 0.01

********************* PG-SQL script ************************

\timing

--
-- Load table, no indexes
--

drop table data cascade;

create table data (
fid integer,
rid integer,
range real,
x real,
y real,
z real,
bs real,
snr real,
rvel real,
cfar smallint);

COPY data (fid,rid,range,snr,bs,rvel,cfar,x,y,z) FROM
'/home/rick/bed/data/data.dat';
select count(*) from data where fid=2 and rid=6;
select count(*) from data where x > 5000 and x < 5500;
select count(*) from data where x > 5000 and x < 5020;

--
-- Load table, index on (fid,rid)
--

drop table data cascade;

create table data (
fid integer,
rid integer,
range real,
x real,
y real,
z real,
bs real,
snr real,
rvel real,
cfar smallint);

create index fidrid_data on data (fid,rid);
COPY data (fid,rid,range,snr,bs,rvel,cfar,x,y,z) FROM
'/home/rick/bed/data/data.dat';
vacuum analyze data;
select count(*) from data where fid=2 and rid=6;
select count(*) from data where x > 5000 and x < 5500;
select count(*) from data where x > 5000 and x < 5020;

--
-- Load table, index on (fid,rid) and (x)
--

drop table data cascade;

create table data (
fid integer,
rid integer,
range real,
x real,
y real,
z real,
bs real,
snr real,
rvel real,
cfar smallint);

create index fidrid_data on data (fid,rid);
create index fidx on data (x);
COPY data (fid,rid,range,snr,bs,rvel,cfar,x,y,z) FROM
'/home/rick/bed/data/data.dat';
vacuum analyze data;
select count(*) from data where fid=2 and rid=6;
select count(*) from data where x > 5000 and x < 5500;
select count(*) from data where x > 5000 and x < 5020;

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Net Virtual Mailing Lists 2005-03-14 12:11:45 Re: Peculiar performance observation....
Previous Message Magnus Hagander 2005-03-14 11:45:08 Re: New user: Windows, Postgresql, Python