Re: Probable faq: need some benchmarks of pgsql vr.s mysql

Lists: pgsql-advocacy
From: Brian Hurt <bhurt(at)spnz(dot)org>
To: pgsql-advocacy(at)postgresql(dot)org
Subject: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 12:58:07
Message-ID: alpine.DEB.2.00.1010290839400.31170@sergyar
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy


My company is currently hitting a problem with mysql/innodb having really
slow insert performance (we're seeing ~1K rows/second). My boss wants to
go and spend a bunch of money on the Tokutek backend. I'd rather we save
the money and go to postgres instead. We're not heavily invested in mysql
at this point (fixing our queries to switch from mysql to postgres would
take about five minutes). But my boss wants to see some benchmarks.

I've googled around for a while, but all the benchmarks I've found commit
one or more "fatal flaws", which render the benchmark pointless at best:

1) Comparing Postgres to MyISAM. Transactions are not an option for us,
so it doesn't matter if MyISAM is a hundred times faster. I want to
compare Postgres to InnoDB (bonus points for Postgres vr.s Tokutek).

2) Using the default configurations. Be serious- is there any one who
cares the least about performance who uses the default configuration?

3) Using old versions of Postgres. I'd like the survey to at least use
the 8.x series, bonus points for it being 9.x.

4) Not using COPY for inserts. We would, of course, be using the copy
command for inserts.

Here's the thing. I have personally seen postgres 8.1 insert 30K
rows/second, in to a real table, on crappy hardware (single slow IDE
drive, old crappy hardware). I would be shocked if I can't improve on the
InnoDB numbers by at least an order of magnitude. I'm whipping together a
personal benchmark to show this. But I need a "professional looking"
benchmark, with pretty charts and graphs and etc., to back me up.

Help?

Brian


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Brian Hurt <bhurt(at)spnz(dot)org>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 16:06:59
Message-ID: 4CCAF123.7000409@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On 10/29/2010 02:58 PM, Brian Hurt wrote:
>
> My company is currently hitting a problem with mysql/innodb having
> really slow insert performance (we're seeing ~1K rows/second). My boss
> wants to go and spend a bunch of money on the Tokutek backend. I'd
> rather we save the money and go to postgres instead. We're not heavily
> invested in mysql at this point (fixing our queries to switch from mysql
> to postgres would take about five minutes). But my boss wants to see
> some benchmarks.

hmm 1k/s sounds very slow for MySQL/innodb if you are batching your
inserts (either multi-value inserts or larger transactions) - is there
anything special to that data(very wide, enourmous number of indexes etc)?

>
> I've googled around for a while, but all the benchmarks I've found
> commit one or more "fatal flaws", which render the benchmark pointless
> at best:
>
> 1) Comparing Postgres to MyISAM. Transactions are not an option for us,
> so it doesn't matter if MyISAM is a hundred times faster. I want to
> compare Postgres to InnoDB (bonus points for Postgres vr.s Tokutek).

never seen a comparison against Tokutek so you might have to benchmark
for yourself.

>
> 2) Using the default configurations. Be serious- is there any one who
> cares the least about performance who uses the default configuration?
>
> 3) Using old versions of Postgres. I'd like the survey to at least use
> the 8.x series, bonus points for it being 9.x.
>
> 4) Not using COPY for inserts. We would, of course, be using the copy
> command for inserts.
>
> Here's the thing. I have personally seen postgres 8.1 insert 30K
> rows/second, in to a real table, on crappy hardware (single slow IDE
> drive, old crappy hardware). I would be shocked if I can't improve on
> the InnoDB numbers by at least an order of magnitude. I'm whipping
> together a personal benchmark to show this. But I need a "professional
> looking" benchmark, with pretty charts and graphs and etc., to back me up.

Not sure what you would consider "professional" - but I did some testing
back in the 8.4 days here:
http://www.kaltenbrunner.cc/blog/index.php?/archives/27-Benchmarking-8.4-Chapter-2bulk-loading.html
- not sure if that actually matches your workload(but I guess you could
easily test yourself if it is that simple to convert your app).

>
> Help?

I don't think advocacy is actually the right list to ask maybe you would
get a wider audience on -performance or -general.

Stefan


From: Rob Wultsch <wultsch(at)gmail(dot)com>
To: Brian Hurt <bhurt(at)spnz(dot)org>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 16:14:34
Message-ID: AANLkTin39LCLtXJb+9gmA-jMCvMTHnPQcBLFkujiPnV+@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On Fri, Oct 29, 2010 at 5:58 AM, Brian Hurt <bhurt(at)spnz(dot)org> wrote:
>
> My company is currently hitting a problem with mysql/innodb having really
> slow insert performance (we're seeing ~1K rows/second).  My boss wants to go
> and spend a bunch of money on the Tokutek backend.  I'd rather we save the
> money and go to postgres instead.  We're not heavily invested in mysql at
> this point (fixing our queries to switch from mysql to postgres would take
> about five minutes).  But my boss wants to see some benchmarks.
>
> I've googled around for a while, but all the benchmarks I've found commit
> one or more "fatal flaws", which render the benchmark pointless at best:
>
> 1) Comparing Postgres to MyISAM.  Transactions are not an option for us, so
> it doesn't matter if MyISAM is a hundred times faster.  I want to compare
> Postgres to InnoDB (bonus points for Postgres vr.s Tokutek).
>
> 2) Using the default configurations.  Be serious- is there any one who cares
> the least about performance who uses the default configuration?
>
> 3) Using old versions of Postgres.  I'd like the survey to at least use the
> 8.x series, bonus points for it being 9.x.
>
> 4) Not using COPY for inserts.  We would, of course, be using the copy
> command for inserts.
>
> Here's the thing.  I have personally seen postgres 8.1 insert 30K
> rows/second, in to a real table, on crappy hardware (single slow IDE drive,
> old crappy hardware).  I would be shocked if I can't improve on the InnoDB
> numbers by at least an order of magnitude.  I'm whipping together a personal
> benchmark to show this.  But I need a "professional looking" benchmark, with
> pretty charts and graphs and etc., to back me up.
>
> Help?
>
> Brian

If you post your conf I can give you pointers on InnoDB performance.
Please also post a hardware profile (including ram and IO hardware).
One way or another this is the right first step for a comparison with
PG.

As for tokutek, I think they are probably the wave of the future. I
have suggested that they create a pg product.

--
Rob Wultsch
wultsch(at)gmail(dot)com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 17:25:42
Message-ID: 4CCB0396.1000807@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

Brian,

> My company is currently hitting a problem with mysql/innodb having
> really slow insert performance (we're seeing ~1K rows/second).

As others have said, there's clearly something wrong with your setup
and/or application which would likely affect PostgreSQL as well. I can
name several ways in which PostgreSQL is better than MySQL/InnoDB, but
insert performance isn't one of them.

>My boss
> wants to go and spend a bunch of money on the Tokutek backend.

I'm sure that Tokutek would be thrilled to have a customer (they're very
new). Again, though, insert performance isn't Tokutek's specialty; the
arguments for using them is that (a) they're not owned by oracle, and
(b) the "fractal indexes", which are cool.

> I've googled around for a while, but all the benchmarks I've found
> commit one or more "fatal flaws", which render the benchmark pointless
> at best:

Doing real benchmarks is an involved, expensive process, and both DBMSes
are fast-moving targets which need to be benchmarked every year. I've
talked with Percona about doing some MySQL vs. PostgreSQL head-to-heads,
but in the absence of specific funding that's unlikely to be completed.
Besides, we both really want to do head-to-heads with MSSQL & Oracle,
not each other.

> Here's the thing. I have personally seen postgres 8.1 insert 30K
> rows/second, in to a real table, on crappy hardware (single slow IDE
> drive, old crappy hardware). I would be shocked if I can't improve on
> the InnoDB numbers by at least an order of magnitude. I'm whipping
> together a personal benchmark to show this. But I need a "professional
> looking" benchmark, with pretty charts and graphs and etc., to back me up.

Don't have anything like that, I'm afraid. Last real benchmark was
SpecJAppserver from 2007 (which at the time was 90% of comparable Oracle
performance), which has long been superceded.

However, given that what you want to know about is insert performance,
it seems like it would be easy enough to mock up your own comparison.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 20:53:25
Message-ID: 4CCB3445.3080604@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

Brian,

Rob pointed me to a benchmark of MySQL vs. Tokutek which is interesting
to your case:

http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/

It's also interesting to me because apparently InnoDB *does* have an
issue with large numbers of inserts to an already-large table, which we
don't have (I don't think we do, anyway).

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 20:59:25
Message-ID: 1288385965.2355.32.camel@jd-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On Fri, 2010-10-29 at 13:53 -0700, Josh Berkus wrote:
> Brian,
>
> Rob pointed me to a benchmark of MySQL vs. Tokutek which is interesting
> to your case:
>
> http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/
>
> It's also interesting to me because apparently InnoDB *does* have an
> issue with large numbers of inserts to an already-large table, which we
> don't have (I don't think we do, anyway).

I thought InnoDB still had P/K insert/update issues. I could be cracked
(would have to dig up the article)

JD

>
> --
> -- Josh Berkus
> PostgreSQL Experts Inc.
> http://www.pgexperts.com
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


From: Brian Hurt <bhurt(at)spnz(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-29 21:37:19
Message-ID: alpine.DEB.2.00.1010291732020.31170@sergyar
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy


For the record, the table we're having trouble inserting into is ~100 rows
with ~50 indexes on it. E.F Codd is spinning in his grave. The reason
they went with this design (instead of one that has two tables, each with
3-6 columns, and about that many indexes) is that "joins are slow".
Which they may be on Mysql, I don't know. But this is (unfortunately) a
different battle.

Brian

On Fri, 29 Oct 2010, Josh Berkus wrote:

> Brian,
>
> Rob pointed me to a benchmark of MySQL vs. Tokutek which is interesting
> to your case:
>
> http://www.mysqlperformanceblog.com/2009/04/28/detailed-review-of-tokutek-storage-engine/
>
> It's also interesting to me because apparently InnoDB *does* have an
> issue with large numbers of inserts to an already-large table, which we
> don't have (I don't think we do, anyway).
>
> --
> -- Josh Berkus
> PostgreSQL Experts Inc.
> http://www.pgexperts.com
>
> --
> Sent via pgsql-advocacy mailing list (pgsql-advocacy(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-advocacy
>


From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Brian Hurt <bhurt(at)spnz(dot)org>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-30 13:57:39
Message-ID: 4CCC2453.70107@kaltenbrunner.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On 10/29/2010 11:37 PM, Brian Hurt wrote:
>
> For the record, the table we're having trouble inserting into is ~100
> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The
> reason they went with this design (instead of one that has two tables,
> each with 3-6 columns, and about that many indexes) is that "joins are
> slow". Which they may be on Mysql, I don't know. But this is
> (unfortunately) a different battle.

is that really only 100 rows or are you actually talking about columns?
if the later you will have a very hard time getting reasonable bulk/mass
loading performance in most databases (and also pg) - a table that wide
and with a that ridiculous number of indexes is just bound to be slow.
Now I actually think that the figures you are getting from innodb are
fairly reasonable...

Stefan


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Brian Hurt <bhurt(at)spnz(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, <pgsql-advocacy(at)postgresql(dot)org>
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-30 16:20:30
Message-ID: 7f7fc6c04bf8c90ee94e48032228cdc3@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On Sat, 30 Oct 2010 15:57:39 +0200, Stefan Kaltenbrunner
<stefan(at)kaltenbrunner(dot)cc> wrote:
> On 10/29/2010 11:37 PM, Brian Hurt wrote:
>>
>> For the record, the table we're having trouble inserting into is ~100
>> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The
>> reason they went with this design (instead of one that has two tables,
>> each with 3-6 columns, and about that many indexes) is that "joins are
>> slow". Which they may be on Mysql, I don't know. But this is
>> (unfortunately) a different battle.

As opposed to the slowness you are having now?

/me smacks your developers for you

JD

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Brian Hurt <bhurt(at)spnz(dot)org>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-30 20:24:49
Message-ID: 4CCC7F11.5080200@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On 10/29/10 2:37 PM, Brian Hurt wrote:
>
> For the record, the table we're having trouble inserting into is ~100
> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The
> reason they went with this design (instead of one that has two tables,
> each with 3-6 columns, and about that many indexes) is that "joins are
> slow". Which they may be on Mysql, I don't know. But this is
> (unfortunately) a different battle.

Not sure that that'll be any better on Postgres then. Few DBMSes
optimize for "stupid design". Maybe they should go with a "NoSQL"
database.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Treat <rob(at)xzilla(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-30 21:03:22
Message-ID: AANLkTi=YK4+7nbq0J0NQu23GXW9837sHUT==CxjcwxRO@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On Fri, Oct 29, 2010 at 1:25 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> Brian,
>
>
> My company is currently hitting a problem with mysql/innodb having
>> really slow insert performance (we're seeing ~1K rows/second).
>>
>
> As others have said, there's clearly something wrong with your setup and/or
> application which would likely affect PostgreSQL as well. I can name several
> ways in which PostgreSQL is better than MySQL/InnoDB, but insert performance
> isn't one of them.
>
>
> My boss
>> wants to go and spend a bunch of money on the Tokutek backend.
>>
>
> I'm sure that Tokutek would be thrilled to have a customer (they're very
> new). Again, though, insert performance isn't Tokutek's specialty; the
> arguments for using them is that (a) they're not owned by oracle,

<snip>

That seems like a really weak argument, unless Tokutek has come out with a
stand-alone product I'm not aware of. At best you'll still need to deal with
Oracle to get the MySQL bits, at worse Tokutek's technology is as good as it
sounds and they are not an acquisition target for Oracle. (Assuming you
don't want to deal with Oracle, which you're reasoning above implies. Many
people won't care about that though).

>
> > I've googled around for a while, but all the benchmarks I've found
>
>> commit one or more "fatal flaws", which render the benchmark pointless
>> at best:
>>
>
> Doing real benchmarks is an involved, expensive process, and both DBMSes
> are fast-moving targets which need to be benchmarked every year. I've
> talked with Percona about doing some MySQL vs. PostgreSQL head-to-heads, but
> in the absence of specific funding that's unlikely to be completed.
> Besides, we both really want to do head-to-heads with MSSQL & Oracle, not
> each other.
>
>
Actually in this case, "real" benchmarks should be pretty easy. You already
have a running application that you have numbers on. I'd say swap in a copy
of xtradb, a copy of tokutek's tech, and a copy of Postgres (after all, it's
only "5 minutes" to change the queries around), and benchmark all three.
Realistically it would probably take a dedicated week to do it, but you're
probably making a 3 year decision, so this bit of time up front seems like a
good way to go. (And don't forget to publish your results afterwards).

Robert Treat
play: http://www.xzilla.net
work: http://www.omniti.com/is/hiring


From: Brian Hurt <bhurt(at)spnz(dot)org>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-10-30 23:44:04
Message-ID: alpine.DEB.2.00.1010301943240.31170@sergyar
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

On Sat, 30 Oct 2010, Stefan Kaltenbrunner wrote:

> On 10/29/2010 11:37 PM, Brian Hurt wrote:
>>
>> For the record, the table we're having trouble inserting into is ~100
>> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The
>> reason they went with this design (instead of one that has two tables,
>> each with 3-6 columns, and about that many indexes) is that "joins are
>> slow". Which they may be on Mysql, I don't know. But this is
>> (unfortunately) a different battle.
>
> is that really only 100 rows or are you actually talking about columns?

Bleh, I meant columns.

100 rows is nothing.

> if
> the later you will have a very hard time getting reasonable bulk/mass loading
> performance in most databases (and also pg) - a table that wide and with a
> that ridiculous number of indexes is just bound to be slow. Now I actually
> think that the figures you are getting from innodb are fairly reasonable...
>
>
> Stefan
>

Brian


From: MARK CALLAGHAN <mdcallag(at)gmail(dot)com>
To: Brian Hurt <bhurt(at)spnz(dot)org>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-advocacy(at)postgresql(dot)org
Subject: Re: Probable faq: need some benchmarks of pgsql vr.s mysql
Date: 2010-11-01 15:23:35
Message-ID: AANLkTimq718qCu=LzEdVQxe0tey7Hm+KZeH4dXXSfkuz@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-advocacy

The "insert buffer" in InnoDB accelerates this workload. It buffers
changes in a special b-tree to avoid disk IO during secondary index
maintenance. For my workloads the special b-tree is able to capture
multiple changes to blocks and is likely reduce the IO requirements
for the application. Even without that benefit it allows the server to
absorb workload spikes as the disk reads for secondary index
maintenance are deferred.

This is done for inserts in MySQL 5.1 and for inserts, updates and
deletes in MySQL 5.5. This won't allow InnoDB to match TokuDB in
performance, but it should provide much better throughput than you
would expect from an engine that does update in place.

http://www.google.com/search?hl=en&q=insert+buffer+innodb

On Sat, Oct 30, 2010 at 4:44 PM, Brian Hurt <bhurt(at)spnz(dot)org> wrote:
>
>
> On Sat, 30 Oct 2010, Stefan Kaltenbrunner wrote:
>
>> On 10/29/2010 11:37 PM, Brian Hurt wrote:
>>>
>>> For the record, the table we're having trouble inserting into is ~100
>>> rows with ~50 indexes on it. E.F Codd is spinning in his grave. The
>>> reason they went with this design (instead of one that has two tables,
>>> each with 3-6 columns, and about that many indexes) is that "joins are
>>> slow". Which they may be on Mysql, I don't know. But this is
>>> (unfortunately) a different battle.
>>
>> is that really only 100 rows or are you actually talking about columns?
>
> Bleh, I meant columns.
>
> 100 rows is nothing.
>
>> if the later you will have a very hard time getting reasonable bulk/mass
>> loading performance in most databases (and also pg) - a table that wide and
>> with a that ridiculous number of indexes is just bound to be slow. Now I
>> actually think that the figures you are getting from innodb are fairly
>> reasonable...
>>
>>
>> Stefan
>>
>
> Brian
>
>
> --
> Sent via pgsql-advocacy mailing list (pgsql-advocacy(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-advocacy
>

--
Mark Callaghan
mdcallag(at)gmail(dot)com