Re: Performance comparison to psql.

Lists: pgsql-jdbc
From: Arie Ozarov <aozarov(at)hi5(dot)com>
To: <pgsql-jdbc(at)postgresql(dot)org>
Subject: Performance comparison to psql.
Date: 2008-02-05 21:42:36
Message-ID: C3CE184C.E6B%aozarov@hi5.com

We compared insert and select operations between JDBC (PostgreSQL 8.2,
postgresql-8.2-506.jdbc3.jar) and psql.

Here are some numbers:

PSQL *COPY* from STDIN
recordsCount = 10 000 000

WITHOUT INDEXES: time = 50544 ms (*0.005 ms* per record)
WITH 2 INDEXES: time = 221491 ms (*0.022 ms* per record)

*JDBC (prepared statement without batch)*
recordsCount = 100 000

WITHOUT INDEXES: time = 64874 ms (*0.649 ms* per record)
WITH 2 INDEXES: time = 63057 ms (*0.630 ms* per record)

*JDBC (prepared statement with batch)*
recordsCount = 1 000 000

WITHOUT INDEXES: time = 73205 ms (*0.073 ms* per record)
WITH 2 INDEXES: time = 100270 ms (*0.100 ms* per record)

Comparison table (records inserted per millisecond)

                   COPY    JDBC    JDBC batch
WITHOUT INDEXES:    198     1.5            14
WITH 2 INDEXES:      45     1.5            10
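
The per-record and per-millisecond figures above follow from dividing the
record counts by the elapsed times. A minimal Java sketch of that arithmetic,
using only the timings quoted in this message (the class and method names are
illustrative and not part of any driver):

```java
import java.util.Locale;

/** Recomputes per-record cost and throughput from the raw timings quoted above. */
class InsertThroughput {

    /** Average cost of one record, in milliseconds. */
    static double msPerRecord(long records, long elapsedMs) {
        return (double) elapsedMs / records;
    }

    /** Records inserted per millisecond of wall-clock time. */
    static double recordsPerMs(long records, long elapsedMs) {
        return (double) records / elapsedMs;
    }

    public static void main(String[] args) {
        String[] labels = {
            "COPY, no indexes", "COPY, 2 indexes",
            "JDBC, no indexes", "JDBC, 2 indexes",
            "JDBC batch, no indexes", "JDBC batch, 2 indexes"
        };
        // Record counts and elapsed times exactly as reported in the message.
        long[] records = {10_000_000L, 10_000_000L, 100_000L, 100_000L, 1_000_000L, 1_000_000L};
        long[] elapsedMs = {50_544L, 221_491L, 64_874L, 63_057L, 73_205L, 100_270L};

        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%-24s %.3f ms/record, %.1f records/ms%n",
                    labels[i],
                    msPerRecord(records[i], elapsedMs[i]),
                    recordsPerMs(records[i], elapsedMs[i]));
        }
    }
}
```

The results agree with the table to within rounding; the unbatched JDBC run
with 2 indexes comes out closer to 1.6 than 1.5 records per millisecond.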

As for select queries, psql was about 3 times faster.
Both the psql and JDBC operations were run remotely from the same machine.

I understand that JDBC has some overhead (object translation, ...) but didn't
think the difference would be that big. Do these numbers look correct (any
optimization suggestions?)

Any performance improvement in postgresql-8.2-507.jdbc4.jar?

Is the copy operation much more optimized than inserts (and if so when/will
the driver support it)?

Thanks,
Arie.


From: Kris Jurka <books(at)ejurka(dot)com>
To: Arie Ozarov <aozarov(at)hi5(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Performance comparison to psql.
Date: 2008-02-05 22:31:54
Message-ID: Pine.BSO.4.64.0802051727210.12452@leary.csoft.net

On Tue, 5 Feb 2008, Arie Ozarov wrote:

> I understand that JDBC has some overhead (object translation, ...) but didn't
> think the difference would be that big. Do these numbers look correct (any
> optimization suggestions?)

The real cost is the protocol level overhead of INSERT vs COPY. JDBC
batch execution groups things together to reduce the number of network
round trips, but it still has to send each insert as an individual request
to the server.
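
For reference, a minimal sketch of the batched prepared-statement pattern
described above, using only standard java.sql calls. The table t(id, name),
the class name, and the batch size parameter are assumptions for illustration;
this is not the code used in the benchmark.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsertSketch {

    /**
     * Inserts all names through a JDBC batch, flushing every batchSize rows.
     * Returns the number of executeBatch() calls that were made.
     */
    static int insertAll(Connection conn, List<String> names, int batchSize) throws SQLException {
        int flushes = 0;
        // Hypothetical table: t(id integer, name text).
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO t (id, name) VALUES (?, ?)")) {
            int pending = 0;
            for (int i = 0; i < names.size(); i++) {
                ps.setInt(1, i);
                ps.setString(2, names.get(i));
                ps.addBatch();             // queue one INSERT in the current batch
                if (++pending == batchSize) {
                    ps.executeBatch();     // submit the queued INSERTs together
                    pending = 0;
                    flushes++;
                }
            }
            if (pending > 0) {             // submit any remainder
                ps.executeBatch();
                flushes++;
            }
        }
        return flushes;
    }
}
```

Batching reduces how often the client waits on the network, but every queued
row is still executed by the server as its own INSERT.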

> Any performance improvement in postgresql-8.2-507.jdbc4.jar?
>

No.

> Is the copy operation much more optimized than inserts (and if so when/will
> the driver support it)?
>

Yes, copy is significantly faster than insert. If you'd like, construct a
psql test case that does 100,000 individual inserts and you'll see it's
not just a JDBC driver/libpq difference.
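
One way to build such a test case is to generate two psql scripts that load
the same rows, one with individual INSERTs and one with COPY. The sketch below
assumes a hypothetical table t(id integer, name text); the file names and row
contents are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class PsqlTestCaseGenerator {

    /** One INSERT statement per row. */
    static String individualInserts(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("INSERT INTO t (id, name) VALUES (")
              .append(i).append(", 'name").append(i).append("');\n");
        }
        return sb.toString();
    }

    /** The same rows as inline COPY data: tab-separated, terminated by "\." */
    static String copyFromStdin(int n) {
        StringBuilder sb = new StringBuilder("COPY t (id, name) FROM STDIN;\n");
        for (int i = 0; i < n; i++) {
            sb.append(i).append('\t').append("name").append(i).append('\n');
        }
        sb.append("\\.\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        int n = 100_000;
        Files.write(Paths.get("inserts.sql"),
                individualInserts(n).getBytes(StandardCharsets.UTF_8));
        Files.write(Paths.get("copy.sql"),
                copyFromStdin(n).getBytes(StandardCharsets.UTF_8));
    }
}
```

Each script can then be run with psql -f against a database containing the
table and timed from the shell. Note that in psql's default autocommit mode
every INSERT in inserts.sql commits separately.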

Copy support is available using this patched driver, but it has not been
integrated into the official version yet.

http://kato.iki.fi/sw/db/postgresql/jdbc/copy/

Kris Jurka


From: Arie Ozarov <aozarov(at)hi5(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Performance comparison to psql.
Date: 2008-02-05 23:09:07
Message-ID: C3CE2C93.E79%aozarov@hi5.com

Would it be better, then, to group inserts using a standard statement (not
prepared, or does it really matter?) this way:
insert into T values (set1), (set2),..,(setN); ?
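
A minimal sketch of building such a multi-row statement in Java. It uses ?
placeholders rather than literal values, which is a variation on the form
above, and the class and method names are hypothetical. Multi-row VALUES lists
are accepted by PostgreSQL 8.2 and later.

```java
class MultiRowInsert {

    /**
     * Builds "INSERT INTO <table> VALUES (?, ?), (?, ?), ..." with one
     * parenthesized group per row and one placeholder per column.
     * The table name is concatenated as-is, so it must be a trusted identifier.
     */
    static String build(String table, int columns, int rows) {
        if (columns < 1 || rows < 1) {
            throw new IllegalArgumentException("columns and rows must be at least 1");
        }
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES ");
        for (int r = 0; r < rows; r++) {
            if (r > 0) {
                sql.append(", ");
            }
            sql.append('(');
            for (int c = 0; c < columns; c++) {
                if (c > 0) {
                    sql.append(", ");
                }
                sql.append('?');
            }
            sql.append(')');
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // prints: INSERT INTO T VALUES (?, ?), (?, ?), (?, ?)
        System.out.println(build("T", 2, 3));
    }
}
```

The resulting string can be passed to Connection.prepareStatement, with
parameters bound row by row in order (index = row * columns + column + 1).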

Any reason for a select statement to be 3 times slower?

When is it planned to include the copy support in the official version?

Thanks!
Arie.



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Arie Ozarov <aozarov(at)hi5(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Performance comparison to psql.
Date: 2008-02-05 23:09:25
Message-ID: 29358.1202252965@sss.pgh.pa.us

Kris Jurka <books(at)ejurka(dot)com> writes:
> On Tue, 5 Feb 2008, Arie Ozarov wrote:
>> I understand that JDBC has some overhead (object translation, ...) but didn't
>> think the difference would be that big. Do these numbers look correct (any
>> optimization suggestions?)

> The real cost is the protocol level overhead of INSERT vs COPY.

Also, if you were inserting only one row per INSERT command, there's a
significant statement startup/shutdown overhead in the server, even for
a prepared statement. I don't see any reason to think that these
numbers are JDBC's fault --- it's just a fact of life that COPY is
a lot more efficient than a series of INSERTs. (If it were not, we'd
hardly even bother having it.)

regards, tom lane