Re: How to implement the skip errors for copy from ?

Lists: pgsql-hackers
From: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: How to implement the skip errors for copy from ?
Date: 2014-06-16 09:46:38
Message-ID: 2014061617463628926328@kingbase.com.cn

I want to implement error skipping for COPY FROM, like this:

create table A (c int primary key);
copy A from stdin;
1
1
2
\.

The copy will fail:
ERROR: duplicate key violates primary key constraint "CC_PKEY"
CONTEXT: COPY CC, line 2: "1"

I want to skip the error and continue copying the rest of the tuples. The
result would be that table A contains two rows: 1 and 2.

How can I implement that? Can anybody give me some suggestions?

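张晓博 (Zhang Xiaobo), R&D Department 2
北京人大金仓信息技术股份有限公司
Address: Room 501, Building 4, Shangdi Science and Technology Tower, No. 8 Shangdi West Road, Haidian District, Beijing
Postal code: 100085
Telephone: (010) 5885 1118 - 8450
Mobile: 15311394463
Email: xbzhang(at)kingbase(dot)com(dot)cn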


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-16 09:54:22
Message-ID: CAFj8pRBgTY+nW+tjmtTZb5an7n55nzV7rpkkkecGzQAeid1ahA@mail.gmail.com

2014-06-16 11:46 GMT+02:00 xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>:

>
> I want to implement error skipping for COPY FROM, like this:
> create table A (c int primary key);
> copy A from stdin;
> 1
> 1
> 2
> \.
>
> The copy will fail:
> ERROR: duplicate key violates primary key constraint "CC_PKEY"
> CONTEXT: COPY CC, line 2: "1"
>
> I want to skip the error and continue copying the rest of the tuples. The
> result would be that table A contains two rows: 1 and 2.
>
> How can I implement that? Can anybody give me some suggestions?
>

You would need to reimplement the copy procedure to use subtransactions. A
subtransaction per row is too expensive, but you can use one subtransaction
per 1000 rows and, when an exception is raised, replay that batch with one
subtransaction per row.

Regards

Pavel Stehule
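
A minimal sketch of the batching strategy described above, written as a toy Python simulation rather than as PostgreSQL code. The names `ToyTable`, `DuplicateKey`, `insert_batch`, and `copy_skip_errors` are hypothetical, and a set snapshot stands in for a real savepoint; the point is only to show that the per-row cost is paid solely for batches that actually contain a bad row.

```python
class DuplicateKey(Exception):
    """Stands in for a unique-constraint violation raised by the server."""


class ToyTable:
    """Hypothetical in-memory table with a primary key and savepoint-like rollback."""

    def __init__(self):
        self.rows = set()
        self.subtransactions = 0  # how many subtransactions were opened

    def insert_batch(self, batch):
        """Insert all keys in one subtransaction; undo the whole batch on error."""
        self.subtransactions += 1
        snapshot = set(self.rows)        # models SAVEPOINT
        try:
            for key in batch:
                if key in self.rows:
                    raise DuplicateKey(key)
                self.rows.add(key)
        except DuplicateKey:
            self.rows = snapshot         # models ROLLBACK TO SAVEPOINT
            raise


def copy_skip_errors(table, keys, batch_size=1000):
    """Load keys in batches; replay a failed batch one row per subtransaction."""
    skipped = []
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        try:
            table.insert_batch(batch)
        except DuplicateKey:
            # The batch was rolled back, so every row in it is retried alone.
            for key in batch:
                try:
                    table.insert_batch([key])
                except DuplicateKey:
                    skipped.append(key)
    return skipped
```

With the input from the original question (`[1, 1, 2]`), the whole batch fails once and is then replayed row by row, leaving rows 1 and 2 in the table and the duplicate 1 in the skipped list. A clean batch costs a single subtransaction.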



From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-16 18:37:41
Message-ID: 20140616183741.GA18688@eldon.alvh.no-ip.org

Pavel Stehule wrote:
> You would need to reimplement the copy procedure to use subtransactions. A
> subtransaction per row is too expensive, but you can use one subtransaction
> per 1000 rows and, when an exception is raised, replay that batch with one
> subtransaction per row.

See http://pgloader.io/ for a ready-made solution.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
To: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 05:30:22
Message-ID: 2014061711543320946949@kingbase.com.cn


With subtransactions, the tuples that were already inserted into the heap
must be inserted again when an exception is raised, which is too expensive.
My solution is:
1. delete the tuple that caused the error;
2. release all the resources acquired while inserting that tuple;
3. continue by inserting the next tuple.
Is this feasible? Can anybody give me some suggestions?



From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 06:01:45
Message-ID: CAFj8pRAKbAcHYv0zLvA=yyiruysi+AuwMn=R1xsopV-CgRTvaw@mail.gmail.com

2014-06-17 7:30 GMT+02:00 xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>:

> With subtransactions, the tuples that were already inserted into the heap
> must be inserted again when an exception is raised, which is too expensive.
> My solution is:
> 1. delete the tuple that caused the error;
> 2. release all the resources acquired while inserting that tuple;
> 3. continue by inserting the next tuple.
> Is this feasible? Can anybody give me some suggestions?
>

No, that would not work: after any exception, some memory structures may be
in an undefined state. Errors in PostgreSQL are destructive, and any error
must be followed by a ROLLBACK.

A subtransaction per row is expensive, but a subtransaction per block of
rows is cheap.

Regards

Pavel



From: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 06:46:24
Message-ID: 2014061714462409019860@kingbase.com.cn

Use one resource owner per tuple; when an error happens, release only the
resource owner that belongs to the failed tuple.
Why would some memory structures be in an undefined state? Can you give
some examples?



From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 06:51:21
Message-ID: CAFj8pRBx5mk640+mcaf_jvuB45_0H-Dk7oQ7Xej0O2fq=q8Cuw@mail.gmail.com

2014-06-17 8:46 GMT+02:00 xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>:

> Use one resource owner per tuple; when an error happens, release only the
> resource owner that belongs to the failed tuple.
> Why would some memory structures be in an undefined state? Can you give
> some examples?
>

Any exception can be raised there, meaning any non-fatal exception. I
remember that when I wrote something similar without exceptions, it was very
unstable.

Pavel



From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 08:40:46
Message-ID: CAA4eK1LJcgomEJYDPP1E-O=_gxdCWP3fVZRYmbrtuXv3_fKCaA@mail.gmail.com

On Tue, Jun 17, 2014 at 12:16 PM, xbzhang <xbzhang(at)kingbase(dot)com(dot)cn> wrote:
>
> Use one resource owner per tuple; when an error happens, release only the
> resource owner that belongs to the failed tuple.
> Why would some memory structures be in an undefined state? Can you give
> some examples?

There might be some LWLocks that were taken before the error, and you
won't know which ones to release. Another issue is that Postgres uses
memory contexts to allocate and free memory in most places, so there can
be allocated memory that needs to be released. Transaction and
subtransaction abort take care of all of this and many more similar
things.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
To: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 09:09:32
Message-ID: 2014061717093157661367@kingbase.com.cn

LWLocks can be recorded in the per-tuple resource owner, so they can be
released the right way, but the memory allocated in memory contexts is a
problem. Are there any other problems?



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
Cc: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-17 13:40:43
Message-ID: 16380.1403012443@sss.pgh.pa.us

xbzhang <xbzhang(at)kingbase(dot)com(dot)cn> writes:
> LWLocks can be recorded in the per-tuple resource owner, so they can be
> released the right way, but the memory allocated in memory contexts is a
> problem. Are there any other problems?

See AbortSubTransaction(), CleanupSubTransaction(), and the rather large
number of subroutines they call. Almost everything that code does is
connected to cleaning up something that might have been left unfinished
after an elog(ERROR) took control away in the middle of some code
sequence.

In addition, you can't just wave your hands and presto the bad tuple is
not there anymore. For example, the failure might have been a unique key
violation in some index or other. Not only is the bad tuple already on
disk, but possibly so are index entries for it in other indexes. In
general the only way to get rid of those index entries is a VACUUM.
So you really have to have a subtransaction whose XID is what you mark
the new tuple with, and then rolling back the subtransaction is what
causes the new tuple to not be seen as good. (Actually getting rid of
it will be left for the next VACUUM.)

regards, tom lane
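
A toy illustration of the point above, as a Python sketch under strong simplifying assumptions: each tuple is stamped with the XID of the subtransaction that inserted it, rollback only records that XID as aborted, and physical removal is deferred to a vacuum step. The names `ToyHeap` and `HeapTuple` are hypothetical, and real PostgreSQL visibility rules (commit status, snapshots, xmax) are considerably more involved than this.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    value: int
    xmin: int  # XID of the (sub)transaction that inserted this tuple


class ToyHeap:
    """Hypothetical heap: tuples stay in storage until vacuum removes them."""

    def __init__(self):
        self.tuples = []
        self.aborted = set()  # XIDs of rolled-back subtransactions
        self.next_xid = 100

    def begin_subtransaction(self):
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def insert(self, value, xid):
        self.tuples.append(HeapTuple(value, xid))

    def rollback(self, xid):
        # Nothing is physically removed; the XID is only recorded as aborted.
        self.aborted.add(xid)

    def visible(self):
        # A tuple counts as "good" only if its inserting XID was not aborted.
        return [t.value for t in self.tuples if t.xmin not in self.aborted]

    def vacuum(self):
        # Physically discard tuples stamped with aborted XIDs.
        self.tuples = [t for t in self.tuples if t.xmin not in self.aborted]
```

In this model, rolling back the subtransaction that inserted a bad tuple hides it from readers immediately, while the tuple itself remains in storage until `vacuum()` runs.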


From: xbzhang <xbzhang(at)kingbase(dot)com(dot)cn>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to implement the skip errors for copy from ?
Date: 2014-06-18 03:21:20
Message-ID: 2014061811212012467393@kingbase.com.cn

Using subtransactions is very inefficient. A project called pg_bulkload
supports skipping errors without using subtransactions, and its performance
is very good. So I want to imitate pg_bulkload to implement error skipping
for COPY.

Suppose I do the following for COPY:
1. Disable all triggers on the table.
2. Skip only the following errors:
     * tuple format errors;
     * check constraint violations;
     * unique or primary key constraint violations.
   Any error other than these three aborts the current transaction.
3. The bad tuple is deleted and its per-tuple resource owner is released.
   On a unique key violation, the xmax of the tuple is set to the current
   transaction ID so that it is not seen as good, and all index entries of
   the bad tuple are physically removed at the next VACUUM.

Is this a correct way to skip errors for COPY FROM?
