Multiple insert performance trick or performance misunderstanding?

Lists: pgsql-performance
From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Multiple insert performance trick or performance misunderstanding?
Date: 2005-09-24 20:51:16
Message-ID: dh4e88$25qe$1@news.hub.org

When I need to insert a few hundred or thousand things in
a table from a 3-tier application, it seems I'm much better
off creating a big string of semicolon-separated insert
statements rather than sending them one at a time - even
when I use the obvious things like wrapping the statements
in a transaction and using the library's prepared statements.

I tried both Ruby/DBI and C#/Npgsql, and in both cases
sets of inserts that took 3 seconds when run individually
took about 0.7 seconds when concatenated together.

Is it expected that I'd be better off sending big
concatenated strings like
"insert into tbl (c1,c2) values (v1,v2);insert into tbl (c1,c2) values (v3,v4);..."
instead of sending them one at a time?

db.ExecuteSQL("BEGIN");
sql = new System.Text.StringBuilder(10000);
for ([a lot of data elements]) {
    sql.Append(
        "insert into user_point_features (col1,col2)" +
        " values (" + obj.val1 + "," + obj.val2 + ");"
    );
}
db.ExecuteSQL(sql.ToString());
db.ExecuteSQL("COMMIT");
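
For readers who want to try the same concatenation pattern outside C#, here is a minimal Python sketch of just the string-building step. It assumes both columns are numeric, since the snippet above splices the values in unquoted. The helper names `sql_number` and `build_batch` are hypothetical and belong to no library. The sketch only produces the string and does not talk to a database.

```python
import math
from typing import Iterable, Tuple, Union

Number = Union[int, float]


def sql_number(value: Number) -> str:
    """Render a Python number as a SQL numeric literal, rejecting anything else."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected a number, got {type(value).__name__}")
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError("NaN and infinity are not plain numeric literals")
    return repr(value)


def build_batch(rows: Iterable[Tuple[Number, Number]]) -> str:
    """Concatenate one INSERT per row into a single semicolon-separated string."""
    return "".join(
        "insert into user_point_features (col1,col2)"
        f" values ({sql_number(a)},{sql_number(b)});"
        for a, b in rows
    )


if __name__ == "__main__":
    print(build_batch([(1, 2), (3, 4.5)]))
```

Restricting the input to numbers is a deliberate choice in this sketch: splicing arbitrary strings into SQL text this way would need proper quoting, which is not attempted here.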


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Multiple insert performance trick or performance misunderstanding?
Date: 2005-09-24 21:15:41
Message-ID: 23702.1127596541@sss.pgh.pa.us

Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> writes:
> Is it expected that I'd be better off sending big
> concatenated strings like
> "insert into tbl (c1,c2) values (v1,v2);insert into tbl (c1,c2) values (v3,v4);..."
> instead of sending them one at a time?

It's certainly possible, if the network round trip from client to server
is slow. I do not think offhand that there is any material advantage
for the processing within the server (assuming you've wrapped the whole
thing into one transaction in both cases); if anything, the
concatenated-statement case is probably a bit worse inside the server
because it will transiently eat more memory. But network latency or
client-side per-command overhead could well cause the results you see.

regards, tom lane
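
The round-trip explanation in the reply above can be illustrated with a back-of-the-envelope model. This is a rough sketch, not a measurement. The statement count, per-round-trip cost, and per-statement server cost below are assumed values, chosen only so the totals land near the 3 second and 0.7 second figures reported in the thread. The function name `estimated_seconds` is hypothetical. Client-side per-command overhead is folded into the per-round-trip cost.

```python
def estimated_seconds(statements, round_trips, rtt_s, server_s_per_statement):
    """Total time = per-round-trip cost plus server work for every statement."""
    return round_trips * rtt_s + statements * server_s_per_statement


# Assumed, illustrative numbers (not taken from the thread):
N = 1000            # number of INSERT statements
RTT_S = 0.0023      # network latency plus client per-command overhead, per round trip
SERVER_S = 0.0007   # server-side processing time per INSERT

# One statement per round trip, plus BEGIN and COMMIT.
one_at_a_time = estimated_seconds(N, N + 2, RTT_S, SERVER_S)

# BEGIN, one concatenated string, COMMIT: three round trips in total.
batched = estimated_seconds(N, 3, RTT_S, SERVER_S)

print(f"one at a time: {one_at_a_time:.2f} s")
print(f"batched:       {batched:.2f} s")
```

Under this model the server-side work is the same in both cases, so the whole difference comes from the number of round trips. That matches the reply's point that the server itself gains nothing from concatenation.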