Re: bytea memory improvement - test results

Lists: pgsql-jdbc
From: Luis Vilar Flores <lflores(at)evolute(dot)pt>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-08-22 23:00:18
Message-ID: 44EB8C82.1000203@evolute.pt


Hello,

This time I believe I have all the tests and source needed to have the
patch accepted.
For those who have forgotten the first emails: I developed a
modified version of the toBytes method of the
org.postgresql.util.PGbytea class.
The old method uses 3 buffers to translate the data from the network
to the client, which uses too much memory.
My method uses only 2 buffers, but makes one more pass through the
original buffer (to calculate the final size).
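
The two-pass idea described above can be sketched roughly as follows. This is a minimal illustration, not the attached PGbytea.java: the class and method names are hypothetical, and it assumes well-formed escape-format input in which a backslash is followed either by a second backslash or by exactly three octal digits.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration of a two-buffer decode; not the driver's PGbytea class. */
final class TwoPassByteaSketch {

    /** Decodes escape-format bytea text with a counting pass followed by a copy pass. */
    static byte[] decode(byte[] escaped) {
        if (escaped == null) {
            return null;
        }
        // Pass 1: compute the decoded size. A doubled backslash (2 input bytes)
        // and an octal escape \ooo (4 input bytes) each produce one output byte.
        int size = escaped.length;
        for (int i = 0; i < escaped.length; i++) {
            if (escaped[i] == '\\') {
                if (escaped[++i] == '\\') {
                    size -= 1;
                } else {
                    size -= 3;
                    i += 2; // skip the remaining two octal digits
                }
            }
        }
        // Pass 2: decode straight into a buffer of exactly the right size.
        byte[] out = new byte[size];
        int pos = 0;
        for (int i = 0; i < escaped.length; i++) {
            byte b = escaped[i];
            if (b != '\\') {
                out[pos++] = b;
                continue;
            }
            byte next = escaped[++i];
            if (next == '\\') {
                out[pos++] = (byte) '\\';
            } else {
                int value = (next - '0') * 64 + (escaped[++i] - '0') * 8 + (escaped[++i] - '0');
                out[pos++] = (byte) value;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] escaped = "a\\001".getBytes(StandardCharsets.US_ASCII); // the text a\001
        System.out.println(Arrays.toString(decode(escaped)));
    }
}
```

Only the incoming buffer and the final result are alive at any time; the price is the extra read of the incoming buffer in the first loop.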

Below is a table with the times and memory usage of the 2 methods,
measured with the supplied ByteaTest class:
OLD method:
size: 0.5MB execute+next: 49ms getBytes: 18ms used mem: 74505KB
size: 1.5MB execute+next: 94ms getBytes: 53ms used mem: 48004KB
size: 2.5MB execute+next: 147ms getBytes: 110ms used mem: 23537KB
size: 3.5MB execute+next: 244ms getBytes: 190ms used mem: 24504KB
size: 4.5MB execute+next: 306ms getBytes: 224ms used mem: 31448KB
size: 5.5MB execute+next: 364ms getBytes: 267ms used mem: 38392KB
size: 6.5MB execute+next: 413ms getBytes: 308ms used mem: 45336KB
size: 7.5MB execute+next: 464ms getBytes: 306ms used mem: 52281KB
size: 8.5MB execute+next: 511ms getBytes: 349ms used mem: 59225KB
size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB
size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB
size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB
size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB
size: 13.5MB execute+next: 820ms getBytes: 514ms used mem: 93945KB
size: 14.5MB execute+next: 865ms getBytes: 734ms used mem: 100888KB
size: 15.5MB execute+next: 921ms getBytes: 586ms used mem: 107833KB
size: 16.5MB execute+next: 1003ms getBytes: 619ms used mem: 114777KB
size: 17.5MB execute+next: 1030ms getBytes: 652ms used mem: 121721KB
size: 18.5MB execute+next: 1102ms getBytes: 927ms used mem: 128664KB
size: 19.5MB execute+next: 1166ms getBytes: 723ms used mem: 135609KB
size: 20.5MB execute+next: 1217ms getBytes: 735ms used mem: 142583KB
size: 21.5MB execute+next: 1284ms getBytes: 766ms used mem: 149527KB
size: 22.5MB execute+next: 1437ms getBytes: 801ms used mem: 156471KB
size: 23.5MB execute+next: 1425ms getBytes: 833ms used mem: 163415KB
size: 24.5MB execute+next: 1453ms getBytes: 866ms used mem: 170359KB
size: 25.5MB execute+next: 1766ms getBytes: 902ms used mem: 177303KB
size: 26.5MB execute+next: 2004ms getBytes: 939ms used mem: 184247KB
size: 27.5MB execute+next: 1650ms getBytes: 968ms used mem: 191191KB
size: 28.5MB execute+next: 1757ms getBytes: 796ms used mem: 198105KB
size: 29.5MB execute+next: 1770ms getBytes: 1040ms used mem: 205086KB
size: 30.5MB execute+next: 1820ms getBytes: 1074ms used mem: 212030KB
size: 31.5MB execute+next: 1869ms getBytes: 1109ms used mem: 218974KB
size: 32.5MB execute+next: 1930ms getBytes: 1146ms used mem: 225918KB
size: 33.5MB execute+next: 2183ms getBytes: 1177ms used mem: 232862KB
size: 34.5MB execute+next: 2241ms getBytes: 1221ms used mem: 239806KB

NEW method:
size: 0.5MB execute+next: 50ms getBytes: 19ms used mem: 73137KB
size: 1.5MB execute+next: 90ms getBytes: 50ms used mem: 43760KB
size: 2.5MB execute+next: 149ms getBytes: 97ms used mem: 16136KB
size: 3.5MB execute+next: 237ms getBytes: 113ms used mem: 14170KB
size: 4.5MB execute+next: 302ms getBytes: 174ms used mem: 18127KB
size: 5.5MB execute+next: 357ms getBytes: 234ms used mem: 22110KB
size: 6.5MB execute+next: 602ms getBytes: 232ms used mem: 26095KB
size: 7.5MB execute+next: 477ms getBytes: 265ms used mem: 30079KB
size: 8.5MB execute+next: 532ms getBytes: 296ms used mem: 34063KB
size: 9.5MB execute+next: 590ms getBytes: 385ms used mem: 38046KB
size: 10.5MB execute+next: 648ms getBytes: 357ms used mem: 42031KB
size: 11.5MB execute+next: 695ms getBytes: 391ms used mem: 46015KB
size: 12.5MB execute+next: 765ms getBytes: 423ms used mem: 49999KB
size: 13.5MB execute+next: 825ms getBytes: 542ms used mem: 53982KB
size: 14.5MB execute+next: 874ms getBytes: 491ms used mem: 57967KB
size: 15.5MB execute+next: 931ms getBytes: 521ms used mem: 61951KB
size: 16.5MB execute+next: 992ms getBytes: 551ms used mem: 65935KB
size: 17.5MB execute+next: 1063ms getBytes: 694ms used mem: 69918KB
size: 18.5MB execute+next: 1111ms getBytes: 618ms used mem: 73903KB
size: 19.5MB execute+next: 1168ms getBytes: 649ms used mem: 77887KB
size: 20.5MB execute+next: 1230ms getBytes: 654ms used mem: 81903KB
size: 21.5MB execute+next: 1289ms getBytes: 687ms used mem: 85890KB
size: 22.5MB execute+next: 1345ms getBytes: 737ms used mem: 89875KB
size: 23.5MB execute+next: 1415ms getBytes: 751ms used mem: 93861KB
size: 24.5MB execute+next: 1461ms getBytes: 782ms used mem: 97846KB
size: 25.5MB execute+next: 1521ms getBytes: 817ms used mem: 101833KB
size: 26.5MB execute+next: 1587ms getBytes: 848ms used mem: 105817KB
size: 27.5MB execute+next: 1634ms getBytes: 877ms used mem: 109804KB
size: 28.5MB execute+next: 1692ms getBytes: 931ms used mem: 113789KB
size: 29.5MB execute+next: 1748ms getBytes: 944ms used mem: 117775KB
size: 30.5MB execute+next: 1820ms getBytes: 972ms used mem: 121760KB
size: 31.5MB execute+next: 1869ms getBytes: 1005ms used mem: 125747KB
size: 32.5MB execute+next: 1915ms getBytes: 1038ms used mem: 129731KB
size: 33.5MB execute+next: 1983ms getBytes: 1088ms used mem: 133718KB
size: 34.5MB execute+next: 2055ms getBytes: 1103ms used mem: 137703KB

As you can see, the execution time remained almost the same (a small
gain for the new version), but memory usage is drastically improved:
at 34.5MB the used memory drops from 239806KB to 137703KB, about 43% less.

These times were obtained on a Celeron M 1.6GHz laptop with 1GB RAM,
running Fedora Core 5, Java 1.5.0_08 and PostgreSQL 8.1.4.

Attached are the modified PGbytea.java, the patch against the
8.1-407 source (the 8.2dev-503 is the same), and the test program
ByteaTest.java.

The test program also validates the correctness of the result
through CRC32.
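
A CRC32 check of this kind can be sketched with the standard java.util.zip.CRC32 class. This is not the attached ByteaTest.java; the class and method names are hypothetical, and the array that would normally come back from ResultSet.getBytes() is replaced by a plain copy so the example runs without a database.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical helper showing a checksum comparison of written and read-back data. */
final class Crc32CheckSketch {

    /** Returns the CRC32 checksum of the whole array. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** True when both arrays have the same length and the same CRC32 checksum. */
    static boolean sameContent(byte[] written, byte[] readBack) {
        return written.length == readBack.length && crc32(written) == crc32(readBack);
    }

    public static void main(String[] args) {
        byte[] written = "123456789".getBytes(StandardCharsets.US_ASCII);
        byte[] readBack = written.clone(); // stand-in for the value read from the database
        System.out.println(Long.toHexString(crc32(written)));
        System.out.println(sameContent(written, readBack));
    }
}
```

A matching checksum does not prove equality in the strict sense, but it is a cheap way to catch a decoder that drops or corrupts bytes.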

Hope to hear some feedback soon, hope I didn't forget anything ...

Luis Flores

Attachment Content-Type Size
ByteaTest.java text/x-java 3.7 KB
PGbytea.java text/x-java 4.0 KB
PGBytea.patch text/x-patch 1.7 KB

From: till toenges <tt(at)kyon(dot)de>
To: Luis Vilar Flores <lflores(at)evolute(dot)pt>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-08-23 14:31:12
Message-ID: 44EC66B0.2040108@kyon.de
Lists: pgsql-jdbc

Luis Vilar Flores wrote:
> Hope to hear some feedback soon, hope I didn't forget anything ...

I have an idea for a minor improvement. The MAX_3_BUFF_SIZE is set to 0.
Actually, you can immediately return an empty byte array if the size of
the incoming buffer is 0; that could be a static final byte[], because
a zero-length array has nothing a caller could modify anyway. In all other
cases, the 2 buffer method is simpler and faster, because it uses fewer
buffers and memory accesses, and is therefore the right solution.
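
The shared empty array suggested above might look roughly like this. It is only a sketch: the class, the constant name and decodeNonEmpty() are hypothetical, and decodeNonEmpty() is a stand-in that merely copies its input where the real decoding would go.

```java
/** Hypothetical sketch of returning one shared empty array for zero-length input. */
final class EmptyResultSketch {

    // Safe to share between callers: a zero-length array has no elements to overwrite.
    private static final byte[] EMPTY = new byte[0];

    static byte[] decodeOrEmpty(byte[] escaped) {
        if (escaped == null) {
            return null;          // the null case is kept as it was
        }
        if (escaped.length == 0) {
            return EMPTY;         // no allocation and no decoding needed
        }
        return decodeNonEmpty(escaped);
    }

    // Stand-in for the actual decoding step; assumption for illustration only.
    private static byte[] decodeNonEmpty(byte[] escaped) {
        return escaped.clone();
    }

    public static void main(String[] args) {
        byte[] first = decodeOrEmpty(new byte[0]);
        byte[] second = decodeOrEmpty(new byte[0]);
        System.out.println(first == second); // both calls return the same shared instance
    }
}
```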

Till


From: Luis Vilar Flores <lflores(at)evolute(dot)pt>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-08-23 14:48:05
Message-ID: 44EC6AA5.10205@evolute.pt
Lists: pgsql-jdbc

till toenges wrote:
> Luis Vilar Flores wrote:
>> Hope to hear some feedback soon, hope I didn't forget anything ...
>
> I have an idea for a minor improvement. The MAX_3_BUFF_SIZE is set to 0.
> Actually, you can immediately return an empty byte array if the size of
> the incoming buffer is 0; that could be a static final byte[], because
> nobody could do anything with it anyway. In all other cases, the 2
> buffer method is simpler and faster, because it uses fewer buffers and
> memory accesses, and is therefore the right solution.
>
> Till

MAX_3_BUFF_SIZE can be deleted (along with the test that uses it). It
was there so that we could set a size threshold for using 3 buffers (the
old algorithm) or only 2 (at the beginning it seemed that 2 buffers were
slower).

If the incoming size is 0 we can use the incoming array. I tried to
change only the buffer algorithm; the null case, for instance, was
already there.

Thanks for the comments,

-- 
Luis Flores
Systems Analyst
Evolute - Consultoria Informática
http://www.evolute.pt
Email: lflores(at)evolute(dot)pt
Tel: (+351) 212949689

CONFIDENTIALITY NOTICE
This e-mail transmission and eventual attached files are intended only
for the use of the individual(s) or entity(ies) named above and may
contain information that is both privileged and confidential and is
exempt from disclosure under applicable law. If you are not the intended
recipient, you are hereby notified that any disclosure, copying,
distribution or use of any of the information contained in this
transmission is strictly restricted. If by any means you have received
this transmission in error, please immediately notify the sender and
delete this e-mail from your system. Thank you.

From: Kris Jurka <books(at)ejurka(dot)com>
To: Luis Vilar Flores <lflores(at)evolute(dot)pt>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-09-26 06:53:06
Message-ID: Pine.BSO.4.63.0609260131040.29854@leary2.csoft.net
Lists: pgsql-jdbc

On Wed, 23 Aug 2006, Luis Vilar Flores wrote:

> To all that already forgot the first emails, I developed a modified
> version of the method toBytes from the org.postgresql.util.PGbytea
> class. The old method uses 3 buffers to translate the data from the
> network to the client, this uses too much memory. My method only uses 2
> buffers, but does one more pass through the original buffer (to
> calculate its final size).
>

I'm not super impressed with these timing results. They are certainly
showing some effects due to GC; consider the rise in getBytes time here
at 10.5MB.

> OLD method:
> size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB
> size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB
> size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB
> size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB

I came up with my own contrived benchmark (attached) that attempts to
focus solely on the getBytes() call and avoid the time of fetching
results, but it doesn't give really consistent results and I haven't been
able to come up with a case that actually shows the new method was faster
even with 30MB of data. This is on Debian Linux / 2xOpteron 246 / jdk
1.5.0-05.

I've committed this to CVS HEAD with a rather arbitrarily set
MAX_3_BUFF_SIZE value of 2MB. Note that this is also the escaped size, so
we may actually be dealing with output data a quarter of that size. If
anyone could do some more testing of what a good crossover point would be
that would be a good thing.
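
The "quarter of that size" remark follows from the escape format: a byte sent as an octal escape occupies four characters (a backslash plus three digits), so in the worst case four escaped bytes decode to one. A rough sketch of the threshold check and that arithmetic is below. The class and method names are hypothetical, writing 2MB as 2 * 1024 * 1024 is an assumption, and the direction of the comparison is read from the thread (inputs above the threshold take the 2-buffer path) rather than from the committed code.

```java
/** Hypothetical sketch of the size threshold discussed in this message. */
final class ThresholdSketch {

    // Assumed value: "2MB" written as 2 * 1024 * 1024 bytes of escaped input.
    static final int MAX_3_BUFF_SIZE = 2 * 1024 * 1024;

    /** True when the escaped input is large enough to take the 2-buffer path. */
    static boolean useTwoBufferPath(int escapedLength) {
        return escapedLength > MAX_3_BUFF_SIZE;
    }

    /** Smallest possible decoded size: every output byte was sent as a 4-byte octal escape. */
    static int minDecodedSize(int escapedLength) {
        return escapedLength / 4;
    }

    public static void main(String[] args) {
        System.out.println(useTwoBufferPath(MAX_3_BUFF_SIZE));      // exactly at the threshold
        System.out.println(useTwoBufferPath(MAX_3_BUFF_SIZE + 1));  // just above it
        System.out.println(minDecodedSize(MAX_3_BUFF_SIZE));        // decoded bytes in the worst case
    }
}
```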

Thanks for your patience with this item.

Kris Jurka

Attachment Content-Type Size
ByteaTest2.java text/plain 823 bytes

From: Luis Vilar Flores <lflores(at)evolute(dot)pt>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-09-26 10:09:10
Message-ID: 4518FC46.8020705@evolute.pt
Lists: pgsql-jdbc

Kris Jurka wrote:
> On Wed, 23 Aug 2006, Luis Vilar Flores wrote:
>
>> To all that already forgot the first emails, I developed a modified
>> version of the method toBytes from the org.postgresql.util.PGbytea
>> class. The old method uses 3 buffers to translate the data from the
>> network to the client, this uses too much memory. My method only uses 2
>> buffers, but does one more pass through the original buffer (to
>> calculate its final size).
>
> I'm not super impressed with these timing results. They are certainly
> showing some effects due to GC, consider the rise in time here at
> 10.5MB.

Well, thanks a lot for the attention. My main purpose was to reduce the
memory footprint. Before I ran the tests, though, I expected the new
method to be slower than the old one because of the extra pass through
the array, so that it would only be better on large files, i.e. where
the reduced memory usage matters more than raw speed.

>> OLD method:
>> size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB
>> size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB
>> size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB
>> size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB
>
> I came up with my own contrived benchmark (attached) that attempts to
> focus solely on the getBytes() call and avoid the time of fetching
> results, but it doesn't give really consistent results and I haven't
> been able to come up with a case that actually shows the new method was
> faster even with 30MB of data. This is on Debian Linux / 2xOpteron 246
> / jdk 1.5.0-05.

The new method is very similar to the old one; it just computes the
final size before the copy. The old method executes fewer instructions
to convert an array, so the new method is only faster when the old one
is slowed down by garbage collection or memory allocation.

> I've committed this to CVS HEAD with a rather arbitrarily set
> MAX_3_BUFF_SIZE value of 2MB. Note that this is also the escaped size,
> so we may actually be dealing with output data a quarter of that size.
> If anyone could do some more testing of what a good crossover point
> would be that would be a good thing.

I think the old option should stay for a while, but I hope that the
new method proves to be as fast as the old, so that we can discard
MAX_3_BUFF_SIZE and always compute the final size, as the method code
would be clearer that way.

> Thanks for your patience with this item.

I am the one who should say thanks, for such a great product ...
I will check the new benchmark, then look at the memory usage and
garbage collection ...

> Kris Jurka
>
> import java.sql.*;
>
> public class ByteaTest2 {
>
>     public static void main(String args[]) throws Exception {
>         Class.forName("org.postgresql.Driver");
>         Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost:5432/jurka","jurka","");
>
>         // Five timed rounds; each prints its elapsed time in milliseconds.
>         for (int k=0; k<5; k++) {
>             long t1 = System.currentTimeMillis();
>             long total = 0;
>
>             for (int j=0; j<10; j++) {
>                 PreparedStatement pstmt = conn.prepareStatement("SELECT varcharsend(repeat(?,?))");
>                 pstmt.setString(1, "a\\001");
>                 pstmt.setInt(2, 150000);
>                 ResultSet rs = pstmt.executeQuery();
>                 rs.next();
>                 // Decode the same row 100 times so that getBytes() dominates the timing.
>                 for (int i=0; i<100; i++) {
>                     byte b[] = rs.getBytes(1);
>                     total += b.length;
>                 }
>
>                 rs.close();
>                 pstmt.close();
>             }
>             long t2 = System.currentTimeMillis();
>             System.out.println(t2-t1);
>         }
>     }
> }

-- 
Luis Flores

From: till toenges <tt(at)kyon(dot)de>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Luis Vilar Flores <lflores(at)evolute(dot)pt>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-09-26 13:07:12
Message-ID: 45192600.5050704@kyon.de
Lists: pgsql-jdbc

Kris Jurka wrote:
> I'm not super impressed with these timing results. They are certainly
> showing some effects due to GC, consider the rise in time here at 10.5MB.

The method isn't necessarily much faster, especially when there are
only a few megabytes involved. This is very difficult to benchmark in
the presence of a garbage collector.

> I've committed this to CVS HEAD with a rather arbitrarily set
> MAX_3_BUFF_SIZE value of 2MB. Note that this is also the escaped size, so
> we may actually be dealing with output data a quarter of that size. If
> anyone could do some more testing of what a good crossover point would be
> that would be a good thing.

AFAIK the MAX_3_BUFF_SIZE entry was a debug artifact. Not needed any
more. The new method is always faster or at least as fast as the old
method, because it requires fewer memory accesses.

3 Buffers:

Buffer1 zeroing (VM internal)
Buffer1 filling

Buffer2 zeroing (VM internal)
Buffer1 reading
Buffer2 writing

Buffer3 zeroing (VM internal)
Buffer2 reading
Buffer3 writing

Total: 8 memory accesses.

Possibly Buffer3 reading, but that's not part of the driver.

2 Buffers:

Buffer1 zeroing (VM internal)
Buffer1 filling

Buffer1 reading (the new pass)

Buffer2 zeroing (VM internal)
Buffer1 reading
Buffer2 writing

Total: 6 memory accesses.

Conclusion: The new method uses less memory. It must be faster as well,
since everything else is fast in comparison to memory access.

Additionally, it requires only 2 allocations; memory allocations have
some overhead as well, and mean more work for the garbage collector in
the end. Even if the VM can do some magic to avoid zeroing the buffers,
the newer method has one less memory access. It is always the winner.
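
For comparison, the 3-buffer sequence listed above can be sketched like this. It is a hypothetical reconstruction for illustration, not the driver's old code: the names are invented, and it assumes well-formed input where a backslash is followed by a second backslash or by three octal digits.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of the 3-buffer approach: input, oversized temp, exact-size result. */
final class ThreeBufferSketch {

    static byte[] decode(byte[] escaped) {            // Buffer1: the escaped input
        if (escaped == null) {
            return null;
        }
        byte[] temp = new byte[escaped.length];       // Buffer2: as large as the input
        int pos = 0;
        for (int i = 0; i < escaped.length; i++) {
            byte b = escaped[i];
            if (b != '\\') {
                temp[pos++] = b;
                continue;
            }
            byte next = escaped[++i];
            if (next == '\\') {
                temp[pos++] = (byte) '\\';
            } else {
                int value = (next - '0') * 64 + (escaped[++i] - '0') * 8 + (escaped[++i] - '0');
                temp[pos++] = (byte) value;
            }
        }
        byte[] result = new byte[pos];                // Buffer3: exact decoded size
        System.arraycopy(temp, 0, result, 0, pos);    // Buffer2 read, Buffer3 write
        return result;
    }

    public static void main(String[] args) {
        byte[] escaped = "a\\001".getBytes(StandardCharsets.US_ASCII); // the text a\001
        System.out.println(Arrays.toString(decode(escaped)));
    }
}
```

Here the temporary buffer and the result are both alive at the end of the call, which is the extra allocation the 2-buffer variant avoids by counting first.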


From: Luis Vilar Flores <lflores(at)evolute(dot)pt>
To: till toenges <tt(at)kyon(dot)de>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement
Date: 2006-09-26 15:35:16
Message-ID: 451948B4.1060309@evolute.pt
Lists: pgsql-jdbc

Your explanation is very simple and correctly explains both methods.

Here are some more comments ...

till toenges wrote:
> Kris Jurka wrote:
>> I'm not super impressed with these timing results. They are certainly
>> showing some effects due to GC, consider the rise in time here at 10.5MB.
>
> The method isn't necessarily much faster, especially when there are
> only a few megabytes involved. This is very difficult to benchmark in
> the presence of a garbage collector.
>
>> I've committed this to CVS HEAD with a rather arbitrarily set
>> MAX_3_BUFF_SIZE value of 2MB. Note that this is also the escaped size, so
>> we may actually be dealing with output data a quarter of that size. If
>> anyone could do some more testing of what a good crossover point would be
>> that would be a good thing.
>
> AFAIK the MAX_3_BUFF_SIZE entry was a debug artifact. Not needed any
> more.

That's almost correct. I would like the new code to be tested more before
it fully replaces the old. For large arrays there's a big memory
advantage, so replacing makes sense; for small arrays the two are almost
the same, so the old code can stay for a while ...

> The new method is always faster or at least as fast as the old
> method, because it requires fewer memory accesses.
>
> 3 Buffers:
>
> Buffer1 zeroing (VM internal)
> Buffer1 filling
>
> Buffer2 zeroing (VM internal)
> Buffer1 reading
> Buffer2 writing
>
> Buffer3 zeroing (VM internal)
> Buffer2 reading
> Buffer3 writing
>
> Total: 8 memory accesses.
>
> Possibly Buffer3 reading, but that's not part of the driver.
>
> 2 Buffers:
>
> Buffer1 zeroing (VM internal)
> Buffer1 filling
>
> Buffer1 reading (the new pass)
>
> Buffer2 zeroing (VM internal)
> Buffer1 reading
> Buffer2 writing
>
> Total: 6 memory accesses.
>
> Conclusion: The new method uses less memory. It must be faster as well,
> since everything else is fast in comparison to memory access.
>
> Additionally, it requires only 2 allocations; memory allocations have
> some overhead as well, and mean more work for the garbage collector in
> the end. Even if the VM can do some magic to avoid zeroing the buffers,
> the newer method has one less memory access. It is always the winner.

Not all memory accesses are created equal :-). Buffer1 is the
biggest buffer, and the new code passes through it one more time. The
last copy in the old method, from Buffer2 to Buffer3, is done through
System.arraycopy, which I think is very, very fast (hardware based), so
the methods are more balanced ...
For large arrays the new method is ALWAYS much faster, of course, due to
memory access.

-- 
Luis Flores

From: Luis Vilar Flores <lflores(at)evolute(dot)pt>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: bytea memory improvement - test results
Date: 2006-10-06 19:21:03
Message-ID: 4526AC9F.2010508@evolute.pt
Lists: pgsql-jdbc

Hello,

These are the results from Kris Jurka's ByteaTest2.java.
If there are more questions, or someone wants something else regarding this
patch, just ask ...

I look forward to seeing a new driver with this patch ...

Celeron M @ 1.6GHz, 1MB L2 cache
Fedora Core 5
jdk1.5.0_08
local db ver 8.1.4
jdbc driver ver 407

OLD
Time: 56261ms Mem: 2760776b
Time: 56077ms Mem: 2753088b
Time: 56181ms Mem: 2753088b
Time: 56184ms Mem: 2753088b
Time: 56259ms Mem: 2753088b

NEW
Time: 34603ms Mem: 1859656b
Time: 34438ms Mem: 1852776b
Time: 34409ms Mem: 1852776b
Time: 34610ms Mem: 1852776b
Time: 34496ms Mem: 1852776b

G5 @ 1.8GHz, 512KB L2 cache
Mac OS X 10.4.8
jdk1.5.0_06
remote db ver 8.1.4(on LAN)
jdbc driver ver 407

OLD
Time: 130390ms Mem: 2812064b
Time: 131060ms Mem: 2811776b
Time: 131039ms Mem: 2811776b
Time: 131627ms Mem: 2811776b
Time: 131772ms Mem: 2811776b

NEW
Time: 83940ms Mem: 1911752b
Time: 83244ms Mem: 1911464b
Time: 83350ms Mem: 1911464b
Time: 83457ms Mem: 1911464b
Time: 83610ms Mem: 1911464b
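
The thread does not show how the Mem figures above were collected. One common way to obtain such a number from inside a Java program is the Runtime API, sketched below; the class and method names are hypothetical, and the allocation in main() is only a stand-in for the work being measured. Because of garbage collection, values read this way are approximate.

```java
/** Hypothetical sketch of reading elapsed time and used heap around a piece of work. */
final class HeapUsageSketch {

    /** Heap currently in use by the JVM, in bytes (approximate). */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        long start = System.nanoTime();

        byte[] payload = new byte[8 * 1024 * 1024]; // stand-in for the measured work
        payload[0] = 1;                             // touch the array so it is really used

        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        long after = usedHeapBytes();
        System.out.println("Time: " + elapsedMs + "ms Mem: " + (after - before) + "b");
    }
}
```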

Attachment Content-Type Size
ByteaTest2.java text/x-java 935 bytes