Re: COPY enhancements

From: Emmanuel Cecchet <manu(at)asterdata(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY enhancements
Date: 2009-10-07 15:39:37
Message-ID: 4ACCB639.7050400@asterdata.com
Lists: pgsql-hackers

Robert Haas wrote:
> On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet <manu(at)asterdata(dot)com> wrote:
>
>> Hi all,
>>
>> I think there is a misunderstanding about what the current patch is about.
>> The patch includes 2 things:
>> - error logging in a table for bad tuples in a COPY operation (see
>> http://wiki.postgresql.org/wiki/Error_logging_in_COPY for an example; the
>> error message, command and so on are automatically logged)
>> - auto-partitioning in a hierarchy of child table if the COPY targets a
>> parent table.
>> The patch does NOT include:
>> - logging errors into a file (a feature we can add later on (next commit
>> fest?))
>>
>
> My failure to have read the patch is showing here, but it seems to me
> that error logging to a table could be problematic: if the transaction
> aborts, we'll lose the log. If this is in fact a problem, we should
> be implementing logging to a file (or stdout) FIRST.
>
I don't think this is really a problem. You can always run a SELECT on
the error table after the COPY operation if you want to diagnose what
happened before the transaction rolls back. I think that using a file to
process the bad tuples is actually less practical than a table if you
want to re-insert these tuples into the database.
But anyway, the current implementation captures the tuple with all the
information needed for logging and delegates the logging to a specific
method. If we want to log to a file (or stdout), we can just provide
another method that writes the already captured info to a file.
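
To illustrate the capture-then-delegate idea, here is a minimal sketch in
Python rather than the patch's C. All names in it (CapturedError,
make_table_sink, make_file_sink, copy_rows) are hypothetical and are not
taken from the patch; the "table" is just an in-memory list standing in
for the error table.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, TextIO


@dataclass
class CapturedError:
    """What is captured for one rejected input row (hypothetical fields)."""
    line_number: int
    raw_line: str
    message: str


# A sink is any callable that receives one captured error.
ErrorSink = Callable[[CapturedError], None]


def make_table_sink(rows: List[CapturedError]) -> ErrorSink:
    """Stand-in for the error table: append each error to a list."""
    def sink(err: CapturedError) -> None:
        rows.append(err)
    return sink


def make_file_sink(stream: TextIO) -> ErrorSink:
    """Alternative sink: write each error as one tab-separated line."""
    def sink(err: CapturedError) -> None:
        stream.write(f"{err.line_number}\t{err.message}\t{err.raw_line}\n")
    return sink


def copy_rows(lines: Iterable[str], parse: Callable[[str], object],
              sink: ErrorSink) -> list:
    """Parse every line; route failures to the sink instead of aborting."""
    good = []
    for number, line in enumerate(lines, start=1):
        try:
            good.append(parse(line))
        except ValueError as exc:
            # Capture once, then delegate: the loader does not care
            # whether the sink is a table, a file, or stdout.
            sink(CapturedError(number, line, str(exc)))
    return good
```

The loading loop is the same in both cases; switching from the table to
a file (or to stdout, by passing sys.stdout to make_file_sink) only
changes which sink is handed to copy_rows.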

I think that the file approach is a separate feature but can be easily
integrated into the current design. The only extra work is making sure
that concurrent COPY commands writing to the same error file use proper
locking.
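
As a rough idea of what that locking could look like, here is a small
Python sketch that uses an exclusive advisory lock (flock, Unix only)
around each append. This is an assumption about one possible approach,
not what the patch does; the function name append_error_record and the
record format are made up for the example.

```python
import fcntl


def append_error_record(path: str, record: str) -> None:
    """Append one record to a shared error file under an exclusive lock.

    Each call opens the file itself, so every writer holds its own open
    file description and flock() serializes them against each other.
    """
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)  # blocks until we own the lock
        try:
            f.write(record.rstrip("\n") + "\n")
            f.flush()  # push the data out while the lock is still held
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

The lock is advisory, so it only helps if every writer to the error file
goes through the same code path.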

Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com
