Re: Replacing a table with constraints

From: Ragnar Hafstað <gnari(at)simnet(dot)is>
To: Mark Fenbers <Mark(dot)Fenbers(at)noaa(dot)gov>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Replacing a table with constraints
Date: 2005-05-13 22:43:19
Message-ID: 1116024199.11747.37.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, 2005-05-13 at 14:38 -0400, Mark Fenbers wrote:
> I have a table called Counties which partially contains a lot bad
> data. By" bad data", I mean some records are missing; some exist and
> shouldn't; and some records have fields with erroneous information.
> However, the majority of the data in the table is accurate. I have
> built/loaded a new table called newCounties with the same structure as
> Counties, but contains no bad data. My was to completely replace the
> contents of Counties with the contents of newCounties. The problem
> is: several other tables have Foreign Key constraints placed on
> Counties. Therefore, Pg will not let me 'DELETE FROM Counties;", nor
> will it let me "DROP TABLE Counties;"
>
> I'm perplexed. Can someone suggest how I can best get data from
> Counties to look just like newCounties?

assuming same primary key on these 2 tables, you have to
consider these cases:

a) records in Counties not in newCounties
b) records in newCounties existing in Counties, but different
c) records in newCounties also correct in Counties
d) records in newCounties missing from Counties

a) get rid of extra records:
delete from Counties
where primarykey not in
(select primarykey from newCounties);
if you hit foreign key constrains, you need to look
at your data a bit more, anyways

b) update erroneous records:
update Counties from newCounties set
col1= newCounties.col1,
col2= newCounties.col2,
col3= newCounties.col3,
...
from newCounties
where primarykey=newCounties.primarykey
and
( Counties.col1 <> newCounties.col1
OR Counties.col2 <> newCounties.col2
OR Counties.col3 <> newCounties.col3
... );

c) nothing to do here

d) insert into Counties
select * from newCounties
where primarykey not in
(select primarykey from Counties);

test case:

test=# create table c (p int, col1 int, col2 int);
CREATE TABLE
test=# insert into c values (1,1,1); -- correct values
INSERT 7693959 1
test=# insert into c values (2,1,2); -- incorrect
INSERT 7693960 1
test=# insert into c values (9,9,9); -- extra value
INSERT 7693961 1
test=# create table newc (p int, col1 int, col2 int);
CREATE TABLE
test=# insert into newc values (1,1,1); -- correct values
INSERT 7693964 1
test=# insert into newc values (2,2,2); -- incorrect in c
INSERT 7693965 1
test=# insert into newc values (3,3,3); -- missing in c
INSERT 7693966 1
test=# delete from c where p not in (select p from newc);
DELETE 1
test=# update c set col1=n.col1, col2=n.col2
test-# from newc as n
test-# where c.p=n.p and (c.col1<>n.col1 or c.col2<>n.col2);
UPDATE 1
test=# insert into c select * from newc where p not in (select p from
c);
INSERT 7693967 1
test=# select * from c;
p | col1 | col2
---+------+------
1 | 1 | 1
2 | 2 | 2
3 | 3 | 3
(3 rows)

note:
b) and c) can be merged and simplified if you don't mind updating
unchanged records:
update Counties from newCounties set
col1= newCounties.col1,
col2= newCounties.col2,
col3= newCounties.col3,
...
from newCounties
where primarykey=newCounties.primarykey;

gnari

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Rajesh Mallah 2005-05-15 15:33:37 Significance of Database Encoding
Previous Message Scott Marlowe 2005-05-13 20:03:51 Re: Replacing a table with constraints