From: Murali Maddali <murali(dot)maddali(at)uai(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Adapter update.
Date: 2007-08-22 17:42:20
Message-ID: 76758090F8686C47A44B6FF52514A1D308C9CA40@hermes.uai.int
Lists: pgsql-general

Hello Group,

I have already asked this question on the Npgsql forum but haven't received a
response so far. Sorry for cross-posting, but I wanted to check whether anyone
has any suggestions for my problem.

I am trying to do my updates through NpgsqlDataAdapter (I also tried the ODBC
driver with no luck) by passing in a DataTable with changes in it, but the
updates take forever.

This is what I am doing: I am reading the data from SQL Server 2005 and
dumping it out to a PostgreSQL 8.2 database.

using (SqlCommand cmd = new SqlCommand(t.SourceSelect, conn))
{
    using (SqlDataReader r = cmd.ExecuteReader())
    {
        DataSet ds = new DataSet("postgis");
        NpgsqlDataAdapter adp = new NpgsqlDataAdapter(t.DestinationSelect, destConn);
        NpgsqlCommandBuilder cmdBld = new NpgsqlCommandBuilder(adp);
        adp.Fill(ds, t.DestinationTable);
        DataTable destTbl = ds.Tables[t.DestinationTable];

        DataTable srcTblSchema = r.GetSchemaTable();
        adp.FillSchema(ds, SchemaType.Mapped, t.DestinationTable);

        // My save method checks whether the row already exists and adds or
        // updates the row in the DataTable (destTbl) accordingly. The whole
        // comparison takes under 2 minutes for 60,000 records.
        while (r.Read())
            _save(r, srcTblSchema, destTbl, destConn);

        r.Close();

        // This is where my application goes into la-la land. If I call this
        // update inside the while loop above, it took about two hours to
        // process the whole thing.
        adp.Update(destTbl);
    }
}

I have around 60000 records. I also have a geometry field on my table.

I have a couple of questions.

1) What can I do to speed up the process? Any database configuration changes,
connection properties, ...?
2) When I call adapter.Update(), does NpgsqlDataAdapter check whether a
column value has really changed? I believe SqlDataAdapter does this
validation before it actually writes to the database.

Any suggestions and comments are greatly appreciated. Right now I am dead in
the water and can't get this to work on large datasets.

Thank you all.

Regards,
Murali K. Maddali
UAI, Inc.
murali(dot)maddali(at)uai(dot)com <mailto:murali(dot)maddali(at)uai(dot)com>

"Always bear in mind that your own resolution to succeed is more important
than any one thing." - Abraham Lincoln




From: Richard Huxton <dev(at)archonet(dot)com>
To: Murali Maddali <murali(dot)maddali(at)uai(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Adapter update.
Date: 2007-08-22 19:41:04
Message-ID: 46CC9150.7040605@archonet.com
Lists: pgsql-general

Murali Maddali wrote:
> This is what I am doing: I am reading the data from SQL Server 2005 and
> dumping it out to a PostgreSQL 8.2 database.

> while (r.Read())
>     _save(r, srcTblSchema, destTbl, destConn);
>
> r.Close();
>
> // This is where my application goes into la-la land. If I call this
> // update inside the while loop above, it took about two hours to
> // process the whole thing.
> adp.Update(destTbl);

That's probably because it was doing each update in its own transaction.
That'll require committing each row to disk.

> I have around 60000 records. I also have a geometry field on my table.
>
> I have a couple of questions.
>
> 1) What can I do to speed up the process? Any database configuration changes,
> connection properties, ...?

Well, if you do it all in a single transaction it should be fairly quick.
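
Something along these lines, for example - an untested sketch reusing the
adapter, command builder and connection names from your snippet (exactly how
NpgsqlCommandBuilder picks up the transaction may differ, so treat this as an
assumption rather than verified Npgsql behaviour):

// Untested sketch: run the whole adapter update inside a single transaction
// so PostgreSQL commits once for the batch instead of once per row.
using (NpgsqlTransaction tx = destConn.BeginTransaction())
{
    // With a command builder, the generated commands may need the
    // transaction attached explicitly.
    cmdBld.GetInsertCommand().Transaction = tx;
    cmdBld.GetUpdateCommand().Transaction = tx;
    cmdBld.GetDeleteCommand().Transaction = tx;

    adp.Update(destTbl);   // all 60,000 rows go in on a single COMMIT
    tx.Commit();
}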

You might also find the DBI-link project useful, if you know any Perl.
That would let you reach out directly from PG to the SQL-Server database.
http://pgfoundry.org/projects/dbi-link/

> 2) When I call adapter.Update(), does NpgsqlDataAdapter check whether a
> column value has really changed? I believe SqlDataAdapter does this
> validation before it actually writes to the database.

Sorry, I don't know - but you have the source, so it should be easy enough to
check. If it doesn't, I'm sure the Npgsql people would be happy to accept a patch.

> Any suggestions and comments are greatly appreciated. Right now I am dead in
> the water and can't get this to work on large datasets.

The fastest way to load data into PG is via COPY; I don't know whether the
npgsql driver supports that. If not, you'd have to go via a text file.

Load the data into an import table (TEMPORARY table probably) and then
just use three queries to handle deletion, update and insertion.
Comparing one row at a time is adding a lot of overhead.
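
From your C# side that would look roughly like this, once the import table is
populated - the table and column names here (import_tmp, dest_table, id, name,
geom) are only placeholders for whatever your schema really is:

// Rough sketch: reconcile the destination table against a staging table with
// three set-based statements instead of row-by-row adapter updates.
// All table/column names below are placeholders, not from the original post.
using (NpgsqlTransaction tx = destConn.BeginTransaction())
{
    string[] statements = {
        // 1. remove rows that no longer exist in the source
        @"DELETE FROM dest_table
           WHERE NOT EXISTS (SELECT 1 FROM import_tmp i WHERE i.id = dest_table.id)",

        // 2. update rows present in both
        @"UPDATE dest_table
             SET name = i.name, geom = i.geom
            FROM import_tmp i
           WHERE dest_table.id = i.id",

        // 3. insert rows that are new
        @"INSERT INTO dest_table (id, name, geom)
          SELECT i.id, i.name, i.geom
            FROM import_tmp i
           WHERE NOT EXISTS (SELECT 1 FROM dest_table d WHERE d.id = i.id)"
    };

    foreach (string sql in statements)
    {
        using (NpgsqlCommand cmd = new NpgsqlCommand(sql, destConn, tx))
            cmd.ExecuteNonQuery();
    }

    tx.Commit();
}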

--
Richard Huxton
Archonet Ltd


From: Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Murali Maddali <murali(dot)maddali(at)uai(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Adapter update.
Date: 2007-09-07 03:06:30
Message-ID: 1189134390.17218.40.camel@neuromancer.home.net
Lists: pgsql-general

On Wed, 2007-08-22 at 20:41 +0100, Richard Huxton wrote:
> Murali Maddali wrote:
> > This is what I am doing: I am reading the data from SQL Server 2005 and
> > dumping it out to a PostgreSQL 8.2 database.

My 2 cents: I'm doing roughly the same thing, but I'm using Perl and
DBI to do it.

> The fastest way to load data into PG is via COPY; I don't know whether the
> npgsql driver supports that. If not, you'd have to go via a text file.
>
> Load the data into an import table (TEMPORARY table probably) and then
> just use three queries to handle deletion, update and insertion.
> Comparing one row at a time is adding a lot of overhead.

My way of doing it..

1. Pull from SQL Server via DBI into a temp CSV file.
2. Import via \copy into PG, into a temp table.
begin transaction
3. Delete duplicate pkey entries in the actual table.
4. Insert new entries into the actual table.
5. Truncate the temp table.
6. Update a log file.
end transaction

works great..

Note on [3]: all the data are new, so instead of just doing an update I
resorted to doing a delete, like MySQL's mysqlimport --replace command.
(My choice.)
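
For anyone doing the same thing from .NET rather than Perl, steps 3-5 boil
down to roughly this loose sketch (the connection, table and key names -
destConn, dest_table, import_tmp, id - are placeholders, and the temp table
is assumed to be loaded already):

// Loose sketch of steps 3-5: delete rows in the real table whose primary key
// also appears in the temp table, insert everything from the temp table
// ("replace" semantics), then clear the temp table for the next run.
// dest_table, import_tmp and id are placeholder names.
using (NpgsqlTransaction tx = destConn.BeginTransaction())
{
    new NpgsqlCommand(
        "DELETE FROM dest_table WHERE id IN (SELECT id FROM import_tmp)",
        destConn, tx).ExecuteNonQuery();

    new NpgsqlCommand(
        "INSERT INTO dest_table SELECT * FROM import_tmp",
        destConn, tx).ExecuteNonQuery();

    new NpgsqlCommand(
        "TRUNCATE import_tmp",
        destConn, tx).ExecuteNonQuery();

    tx.Commit();
}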