Re: COPY enhancements

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Emmanuel Cecchet <manu(at)asterdata(dot)com>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY enhancements
Date: 2009-10-07 07:17:20
Message-ID: alpine.GSO.2.01.0910070255190.16948@westnet.com
Lists: pgsql-hackers

On Mon, 5 Oct 2009, Josh Berkus wrote:

>> I think that this was the original idea but we should probably roll back
>> the error logging if the command has been rolled back. It might be more
>> consistent to use the same hi_options as the copy command. Any idea what
>> would be best?
>
> Well, if we're logging to a file, you wouldn't be *able* to roll them
> back. Also, presumably, if you abort a COPY because of errors, you
> probably want to keep the errors around for later analysis. No?

Absolutely, that's the whole point of logging to a file in the first
place. What needs to happen here is that when a COPY is aborted, that
fact gets logged, with enough information (the pid?) to tie it to the
COPY that failed. Then someone can crawl the logs to figure out what
happened and what data did and didn't get loaded. Ideally you'd want to
have that as database table information instead, to allow automated
recovery and re-commit in the cases where the error wasn't inherent in
the data but instead some other type of database failure.
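
As a rough sketch of the log-crawling idea, here is how grouping logged
errors by pid might look. The log line format, the ABORTED/COMMITTED
markers, and the summarize() helper are all hypothetical, invented for
this illustration; nothing in the patch defines them.

```python
import re
from collections import defaultdict

# Hypothetical log line format, assumed only for this sketch:
#   pid=<n> COPY <table> ERROR line <m>: <message>
#   pid=<n> COPY <table> ABORTED
#   pid=<n> COPY <table> COMMITTED
LINE_RE = re.compile(
    r"pid=(?P<pid>\d+) COPY (?P<table>\S+) "
    r"(?P<kind>ERROR|ABORTED|COMMITTED)"
    r"(?: line (?P<lineno>\d+): (?P<msg>.*))?$"
)


def summarize(log_lines):
    """Group logged COPY errors by pid and record each COPY's final status."""
    copies = defaultdict(
        lambda: {"table": None, "status": "unknown", "errors": []}
    )
    for raw in log_lines:
        m = LINE_RE.search(raw.strip())
        if m is None:
            continue  # not one of the (hypothetical) COPY log lines
        entry = copies[int(m["pid"])]
        entry["table"] = m["table"]
        if m["kind"] == "ERROR":
            # Only keep error lines that carry a line number and message.
            if m["lineno"] is not None:
                entry["errors"].append((int(m["lineno"]), m["msg"]))
        else:
            entry["status"] = m["kind"].lower()
    return dict(copies)
```

A COPY whose status ends up "aborted" is one whose input rows did not get
loaded, and its error list says which lines to look at. Since pids get
reused over time, a real scheme would probably need a timestamp or some
other identifier alongside the pid.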

I know this patch is attracting more reviewers lately; is anyone tracking
the general architecture of the code yet? Emmanuel's work is tough to
review just because there are so many things mixed together, and there are
other inputs I think should be considered at the same time while we're all
testing in there (such as the COPY patch Andrew Dunstan put together).

What I'd like to see is for everything to get broken up further into
component chunks that can get committed and provide something useful one
at a time, because I doubt taskmaster Robert is going to let this one
linger around with scope creep for too long before being pushed out to
the next CommitFest. It would be nice to have a clear game plan to focus
the work on, one built around the obvious 3 smaller subpatches that can
be committed one at a time as they're ready, so that something might be
polished enough for this CF. And, yes, I'm suggesting work only because I
now have some time to help with that if the idea seems reasonable to
pursue. I'd be glad to set up a public git repo or something to serve as
an integration point for dependency merge management and testing that
resists bit-rot while splitting things up functionally.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
