From: Mark Fenbers <Mark(dot)Fenbers(at)noaa(dot)gov>
To: pgsql-sql(at)postgresql(dot)org
Subject: import ignoring duplicates
Date: 2010-05-16 18:38:01
Message-ID: 4BF03B89.9020908@noaa.gov
Lists: pgsql-sql
I am using psql's \copy command to add records to a database from a
file. The file has over 100,000 lines. Occasionally, there is a
duplicate, and the import ceases and an internal rollback is performed.
In other words, no data is imported even if the first error occurs near
the end of the file.
I am looking for an option/switch to tell psql (or the \copy command) to
skip over any duplicate key constraint violations and continue to load
any data that doesn't violate a duplicate key constraint. Is there such
an option?
Mark
Attachment: Mark_Fenbers.vcf (text/x-vcard, 402 bytes)
From: Tim Landscheidt <tim(at)tim-landscheidt(dot)de>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: import ignoring duplicates
Date: 2010-05-17 00:32:00
Message-ID: m3y6fjwepr.fsf@passepartout.tim-landscheidt.de
Lists: pgsql-sql
Mark Fenbers <Mark(dot)Fenbers(at)noaa(dot)gov> wrote:
> I am using psql's \copy command to add records to a database
> from a file. The file has over 100,000 lines.
> Occasionally, there is a duplicate, and the import ceases
> and an internal rollback is performed. In other words, no
> data is imported even if the first error occurs near the end
> of the file.
> I am looking for an option/switch to tell psql (or the \copy
> command) to skip over any duplicate key constraint
> violations and continue to load any data that doesn't
> violate a duplicate key constraint. Is there such an
> option?
No. You can either disable the constraint temporarily, import the data,
fix any duplicates and re-enable the constraint, or you can load the
data in a temporary table and then transfer only the valid data. With
only 100000 records I would opt for the latter.
Tim
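
[Editor's note: the temporary-table approach Tim describes can be sketched as follows. This is a minimal illustration, not code from the thread; the table `target`, its key column `id`, and the column `payload` are assumed names.]

```sql
BEGIN;

-- Staging table with the same columns as the target, but without its
-- constraints, so \copy cannot fail on duplicate keys.
CREATE TEMP TABLE staging (LIKE target INCLUDING DEFAULTS);

-- In psql:  \copy staging FROM 'data.txt'

-- Transfer only rows whose key is not already in the target, and keep
-- one row per key when the file itself contains duplicates.
INSERT INTO target (id, payload)
SELECT DISTINCT ON (id) id, payload
FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id)
ORDER BY id;

COMMIT;
```

On PostgreSQL 9.5 and later (not yet available when this thread was written), the transfer step can instead use `INSERT ... ON CONFLICT DO NOTHING` to let the server discard the duplicate-key rows.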
From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Mark Fenbers <Mark(dot)Fenbers(at)noaa(dot)gov>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: import ignoring duplicates
Date: 2010-05-17 06:04:17
Message-ID: AANLkTila7aZ9_VwSnnH47dBZ47MqsZmOY9OtiKSZviwS@mail.gmail.com
Lists: pgsql-sql
On Sun, May 16, 2010 at 12:38 PM, Mark Fenbers <Mark(dot)Fenbers(at)noaa(dot)gov> wrote:
> I am using psql's \copy command to add records to a database from a file.
> The file has over 100,000 lines. Occasionally, there is a duplicate, and
> the import ceases and an internal rollback is performed. In other words, no
> data is imported even if the first error occurs near the end of the file.
>
> I am looking for an option/switch to tell psql (or the \copy command) to
> skip over any duplicate key constraint violations and continue to load any
> data that doesn't violate a duplicate key constraint. Is there such an
> option?
Sounds like you want this:
http://pgfoundry.org/projects/pgloader/
Note that COPY is optimized to work in a single transaction. Breaking
those semantics WILL result in a slow load time, and there's not much
you can do about that.