DBD::Pg performance bites compared to DBD::mysql?

Lists: pgsql-interfaces
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-interfaces(at)postgreSQL(dot)org
Subject: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-06-03 05:17:16
Message-ID: 1015.1117775836@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces

In my copious free time ;-), I've been poking at MySQL's "sql-bench"
benchmark and trying to understand why it makes us look so bad.
There are a number of things involved, but one item that I just realized
tonight is that a pretty substantial part of the problem is on the
client side. The test driver is written in Perl and depends on DBI/DBD
to connect to the database. Here is a Perl Devel::DProf trace for a
run of 300000 simple INSERT commands (plus a few CREATE TABLE and other
ilk):

PG:

Total Elapsed Time = 231.6619 Seconds
User+System Time = 132.5019 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
104. 4.140 138.81 300007 0.0000 0.0005 DBI::db::do
101. 27.09 134.67 300007 0.0001 0.0004 DBD::Pg::db::do
44.2 3.050 58.638 300008 0.0000 0.0002 DBI::db::prepare
41.9 9.660 55.589 300008 0.0000 0.0002 DBD::Pg::db::prepare
29.9 8.270 39.629 300008 0.0000 0.0001 DBI::_new_sth
26.5 35.16 35.160 300008 0.0001 0.0001 DBI::st::execute
23.6 14.17 31.359 300013 0.0000 0.0001 DBI::_new_handle
11.0 14.66 14.660 300013 0.0000 0.0000 DBI::_setup_handle
4.75 6.300 6.300 300008 0.0000 0.0000 DBD::Pg::st::_prepare
4.75 6.300 6.300 300004 0.0000 0.0000 DBI::st::rows
4.04 5.350 5.350 600016 0.0000 0.0000 DBI::st::DESTROY
1.91 2.530 2.530 300013 0.0000 0.0000 DBI::st::TIEHASH
1.61 2.130 2.130 300011 0.0000 0.0000 DBD::_mem::common::DESTROY
0.14 0.109 0.188 7 0.0156 0.0268 main::BEGIN
0.02 0.020 0.030 1 0.0199 0.0297 DBI::install_driver

MySQL:

Total Elapsed Time = 115.0148 Seconds
User+System Time = 31.19480 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
69.1 21.57 21.570 300006 0.0001 0.0001 DBI::db::do
0.57 0.099 0.177 7 0.0142 0.0254 main::BEGIN
0.06 0.020 0.020 6 0.0033 0.0033 DynaLoader::dl_load_file
0.06 - 0.020 5 - 0.0040 Cwd::BEGIN
0.06 - 0.020 3 - 0.0066 DynaLoader::bootstrap
0.06 0.010 0.020 1 0.0100 0.0197 DBI::install_driver
0.06 0.010 0.020 12 0.0008 0.0016 DBI::BEGIN
0.06 - 0.020 2 - 0.0098 DBI::connect
0.06 - 0.019 2 - 0.0097 db_MySQL::connect
0.06 - 0.019 1 - 0.0194 db_MySQL::version
0.03 0.010 0.010 2 0.0050 0.0050 AutoLoader::import
0.03 - 0.010 2 - 0.0050 POSIX::BEGIN
0.03 - 0.010 1 - 0.0100 POSIX::import
0.03 0.010 0.010 10 0.0010 0.0010 vars::import
0.03 - 0.010 1 - 0.0100 Benchmark::import

The server-side time isn't all that much different: about 100 sec for
us, 85 for them. But there is something way wrong on the Perl side.

I'm not any kind of Perl module guru, but it looks to me like the
explanation for the discrepancy is that much of DBD::mysql is written
in C while the equivalent code in DBD::Pg is written in Perl. This is
killing us for any app written in Perl :-( --- the above trace only
shows a factor of 2 total penalty, but I've seen up to a factor of 4 in
other cases. There is no way that a client-side driver should be taking
several times longer than the backend to process a SQL command :-(

Can anyone who knows more about Perl confirm or deny?
Anyone want to step up to fix it?

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-06-05 02:27:55
Message-ID: 87ekbhk05w.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> I'm not any kind of Perl module guru, but it looks to me like the
> explanation for the discrepancy is that much of DBD::mysql is written
> in C while the equivalent code in DBD::Pg is written in Perl. This is
> killing us for any app written in Perl :-( --- the above trace only
> shows a factor of 2 total penalty, but I've seen up to a factor of 4 in
> other cases. There is no way that a client-side driver should be taking
> several times longer than the backend to process a SQL command :-(

What version of DBD::Pg did you test with? Development was pretty moribund for
a while but has picked up recently, just in time for 8.0 and the new binary
prepared query support.

Incidentally, the DBD::Pg mailing list is dbdpg-general(at)gborg(dot)postgresql(dot)org

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-06-05 04:40:43
Message-ID: 815.1117946443@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces

Greg Stark <gsstark(at)mit(dot)edu> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> ... There is no way that a client-side driver should be taking
>> several times longer than the backend to process a SQL command :-(

> What version of DBD::Pg did you test with?

DBD::Pg 1.40, also perl-DBI-1.40 and perl-5.8.5. Greg Mullane
promised he'd take a closer look after he gets done moving to
Florida... in the meantime, I have a self-contained test case
if someone else wants to dig into it.

(Greg indicated that he thought the coding style in the Perl
test script pretty well sucked, which I don't doubt, but the
question remains why we are so much slower than MySQL on the
same code.)

regards, tom lane


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-06-14 03:04:03
Message-ID: 933399e2f40b843429889b52d6ebae8d@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


>> the above trace only shows a factor of 2 total penalty, but I've
>> seen up to a factor of 4 in other cases. There is no way that a
>> client-side driver should be taking several times longer than the
>> backend to process a SQL command :-(

> What version of DBD::Pg did you test with?

This is a problem with the newer versions as well. Maybe even worse,
as we had to do some contortions when the PQprepare 7.4/8.0 mess
appeared. However, we can definitely speed up that part of the code.
I'll see about doing this soon.

I still maintain the original test is brain-dead, however. :) It was
testing for "insert speed" by basically doing this:

for (1..1000) {
$dbh->do("INSERT INTO testtable(a,b) VALUES (1,2)");
}

It's as if the people who wrote it *cough*mysql*cough* never heard
of prepare/execute and did not read the DBI docs.

It's very similar to the script on this page:

http://www.innodb.com/bench.php

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200506121438
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFCrILHvJuQZxSWSsgRAkqnAKCoU1XAfE9nsQf11U0hM2P7Vt/uOACfSXmN
Dt9h3ACXGLEAXiIdjcXMLRg=
=8fiG
-----END PGP SIGNATURE-----


From: Greg Stark <gsstark(at)mit(dot)edu>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-06-14 06:44:28
Message-ID: 87u0k11lpv.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces

"Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:

> I still maintain the original test is brain-dead, however. :) It was
> testing for "insert speed" by basically doing this:
>
> for (1..1000) {
> $dbh->do("INSERT INTO testtable(a,b) VALUES (1,2)");
> }
>
> It's as if the people who wrote it *cough*mysql*cough* never heard
> of prepare/execute and did not read the DBI docs.

Sure but wouldn't that show up as more time spent in the server in the parser
and planner? I thought Tom was complaining that it was spending too much time
in the client not waiting on server responses.

I wonder if Postgres is being bitten by doing lots of extra round trips. Maybe
the MySQL driver detects when there are no placeholders and no parameters and
uses a different protocol method that only requires a single round trip? Or
given your example maybe it even notices no result is needed and doesn't even
wait for a response?

--
greg


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-interfaces(at)postgresql(dot)org
Cc: gsstark(at)mit(dot)edu
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-06-15 02:17:44
Message-ID: 8b2e8b9f285b6431df744927947cdfd9@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> I wonder if Postgres is being bitten by doing lots of extra round trips. Maybe
> the MySQL driver detects when there are no placeholders and no parameters and
> uses a different protocol method that only requires a single round trip?

We're not making any extra round trips per se, but we are not taking advantage of
the fact that we can know up front that the statement will not take any
parameters, and we can avoid parsing it for placeholders. My current idea is to treat
do(statement,args) as it currently is (sent to prepare/execute using PQexec (<7.4) or
PQexecParams (newer servers)), but to make a special fast path for simple do(statement)
that basically just throws the whole thing in a quick PQexec and returns. We've been
doing a lot of rewriting of stuff to support the new 7.4/8.0 v3 protocol stuff
such as PQprepare: now we can start the optimizing. :)

> Or given your example maybe it even notices no result is needed and doesn't even
> wait for a response?

That would be a neat trick, but they are not doing that. I don't even think it's
currently possible to derive that information via DBI the way things are now. Would
make a nice long-term micro-optimization though. :)

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200506142215
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iD8DBQFCr4+uvJuQZxSWSsgRApw5AJ9g3tQfihVWj6yOV0sHp5/Md15r4QCgps0Y
f1j8nPU6EqioSh20tckBsE0=
=X4WB
-----END PGP SIGNATURE-----


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-08-22 00:20:00
Message-ID: e8701beb659e7207431e08c37b69bcaf@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> I'm not any kind of Perl module guru, but it looks to me like the
> explanation for the discrepancy is that much of DBD::mysql is written
> in C while the equivalent code in DBD::Pg is written in Perl. This is
> killing us for any app written in Perl :-( --- the above trace only
> shows a factor of 2 total penalty, but I've seen up to a factor of 4 in
> other cases. There is no way that a client-side driver should be taking
> several times longer than the backend to process a SQL command :-(
>
> Can anyone who knows more about Perl confirm or deny?
> Anyone want to step up to fix it?

I can't imagine that it was causing a factor of 2 difference, but I've
changed the code in DBD::Pg to make do() a lot smarter, for brain-dead
benchmarks such as MySQL's that use do() instead of prepare/execute
like they should. In short, DBD::Pg will short-circuit a lot of the usual
work and call PQexec directly if it can get away with it, which should
make the profile Tom posted look a lot better. It also avoids overhead of
going through Pg.pm, and is all C/XS code now. Hopefully, this should
make the benchmarking results a little more fair. :) This is now in
cvs, and will be part of the upcoming version 1.44 of DBD::Pg.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200508212016
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFDCRoBvJuQZxSWSsgRAlb7AJ9IfY1fmxRoib1Bct4HCHRi6WoCCwCfQjmk
oqhEPn8SzwcEe9SouwEUD2s=
=eKgZ
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-08-22 01:25:56
Message-ID: 5495.1124673956@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces

"Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:
> I can't imagine that it was causing a factor of 2 difference, but I've
> changed the code in DBD::Pg to make do() a lot smarter, for brain-dead
> benchmarks such as MySQL's that use do() instead of prepare/execute
> like they should. In short, DBD::Pg will short-circuit a lot of the usual
> work and call PQexec directly if it can get away with it, which should
> make the profile Tom posted look a lot better. It also avoids overhead of
> going through Pg.pm, and is all C/XS code now. Hopefully, this should
> make the benchmarking results a little more fair. :) This is now in
> cvs, and will be part of the upcoming version 1.44 of DBD::Pg.

Cool. Thanks for doing that. Do you have an idea when 1.44 will
be out?

regards, tom lane


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-interfaces(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: DBD::Pg performance bites compared to DBD::mysql?
Date: 2005-08-22 01:47:27
Message-ID: a506413d3915097ec54422522fbe4a18@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-interfaces


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> Cool. Thanks for doing that. Do you have an idea when 1.44 will
> be out?

Not really, I'm afraid. I was tracking down a quoting issue this weekend
and have now gone down a twisty path that induced ripping out and
rewriting most of the "type" and quoting code, which is now lying in
pieces in my work directory. :) If I get the time to finish it soon, maybe
a release in two weeks or so. Otherwise, I might just back out the changes
and release the new do() as 1.44.

And if everyone reading this who uses DBD::Pg could test the code in cvs
right now, that would be greatly appreciated: there were some substantial
changes that I'd love to catch any bugs on before rather than after the
next release. I'll even roll up a tarball for your testing pleasure:

http://www.gtsm.com/DBD-Pg-1.43_1.tar.gz

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 200508212142
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFDCS6KvJuQZxSWSsgRAkzKAKCvCzjpvpzVSdqCrL4VLy0vkDj5rwCgwYz6
TNfaqozZ4aUUuaY4YeuFGsI=
=YpRL
-----END PGP SIGNATURE-----