Re: Data Warehouse Reevaluation - MySQL vs Postgres

From: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To: Mark Cotner <mcotner(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Data Warehouse Reevaluation - MySQL vs Postgres
Date: 2004-09-12 05:22:48
Message-ID: Pine.LNX.4.44.0409120655260.9559-100000@zigo.dhs.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, 11 Sep 2004, Mark Cotner wrote:

> There are no performance concerns with MySQL, but it would benefit
> greatly from stored procedures, views, etc. It is a very large rolling
> data warehouse that inserts about 4.5 million rows every 2 hours and
> subsequently rolls this data off the back end of a 90 day window.

While it is impossible to know without testing, postgresql has the benefit
of readers and writers that does not block each other. So in situations
where you do lots of concurrent inserts and selects postgresql should
behave well.

> Merge table definition equivalent. We use these extensively.

As far as I can tell a merge table in mysql is the same as a view over a
number of unions of other tables. And possibly a rule that defines how
inserts will be done if you do inserts in the merged table.

> Merge table equivalent with all tables containing over 100M rows(and
> about 40 columns, some quite wide) will need to do index scans in at
> least 5 seconds(MySQL currently does 2, but we can live with 5) and
> return ~200 rows.

Since each table that are merged will have it's own index the speed should
be proportional to the number of tables. Index scans in them self are very
fast, and of you have 30 tables you need 30 index scans.

Also, are you sure you really need merge tables? With pg having row locks
and mvcc, maybe you could go for a simpler model with just one big table.
Often you can also combine that with partial indexes to get a smaller
index to use for lots of your queries.

> Thoughts, suggestions?

I see nothing in what you have written that indicates that pg can not do
the job, and do it well. It's however very hard to know exactly what is
the bottleneck before one tries. There are lots of cases where people have
converted mysql applications to postgresql and have gotten a massive
speedup. You could be lucky and have such a case, who knows..

I spend some time each day supporting people using postgresql in the
#postgresql irc channel (on the freenode.net network). There I talk to
people doing both small and big conversions and the majority is very happy
with the postgresql performance. Postgresql have gotten faster and faster
with each release and while speed was a fair argument a number of years
ago it's not like that today.

That said, in the end it depends on the application.

We are all interested in how it goes (well, at least me :-), so feel free
to send more mails keeping us posted. Good luck.

--
/Dennis Björklund

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Gaetano Mendola 2004-09-12 09:07:34 Re: Data Warehouse Reevaluation - MySQL vs Postgres
Previous Message Mark Cotner 2004-09-12 04:24:42 Data Warehouse Reevaluation - MySQL vs Postgres