Re: COPY enhancements

From: Emmanuel Cecchet <manu(at)asterdata(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY enhancements
Date: 2009-09-10 23:54:47
Message-ID: 4AA991C7.5060105@asterdata.com
Lists: pgsql-hackers

Josh Berkus wrote:
>> I am not really sure why you need a natural key.
>>
>
> a) because we shouldn't be building any features which teach people bad
> db design, and
>
> b) because I will presumably want to purge records from this table
> periodically and doing so without a key is likely to result in purging
> the wrong records.
>
Agreed, but I am not sure that unilaterally imposing a key is going to
suit everyone.
>> By default, the partition_key contains the index of the faulty entry and
>> label the copy command. This could be your key.
>>
>
> Well, you still haven't explained the partition_key to me, so I'm not
> quite clear on that. Help?
>
Please re-read my previous message for the default behavior. If you look
at the example at http://wiki.postgresql.org/wiki/Error_logging_in_COPY,
input_file.txt has 5 rows, of which only rows 1 and 5 are correct (rows
2, 3 and 4 are bad). The partition key indicates the row that caused the
problem (2 for row 2, 3 for row 3, ...) and the label contains the full
COPY statement.
If you want to know how it is used in the Aster product, we will have to
ask Alex, who implemented the feature in the product.
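
To make the default behavior concrete, here is a rough Python sketch of
the idea. It is only a toy model, not the patch's code: the function
name, the dictionary fields other than partition_key and label, the
sample rows and the "integer id, text name" row format are all
assumptions made up for illustration.

```python
def copy_with_error_log(lines, copy_statement):
    """Toy model: load rows shaped like '<int>,<text>' and log the bad ones.

    Each logged error carries the 1-based index of the faulty row as its
    partition_key and the full COPY statement as its label.
    """
    loaded, error_log = [], []
    for row_number, line in enumerate(lines, start=1):
        try:
            id_text, name = line.split(",")      # wrong column count raises ValueError
            loaded.append((int(id_text), name))  # non-integer id raises ValueError
        except ValueError as exc:
            error_log.append({
                "partition_key": row_number,     # index of the faulty row
                "label": copy_statement,         # the COPY statement that was run
                "rawdata": line,                 # hypothetical field: the offending input
                "message": str(exc),             # hypothetical field: why it failed
            })
    return loaded, error_log


# Hypothetical input: rows 1 and 5 are good, rows 2, 3 and 4 are bad.
rows = ["1,one", "two,2,extra", "x,three", "4", "5,five"]
stmt = "COPY foo FROM 'input_file.txt' (hypothetical statement text)"

loaded, error_log = copy_with_error_log(rows, stmt)
print([e["partition_key"] for e in error_log])  # [2, 3, 4]
```

With a single COPY, the (label, partition_key) pair is enough to point
at one faulty row; whether that is good enough as a key across many
concurrent COPY commands is exactly the open question here.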

> The reason why I'd like to have a session_id or pid or similar is so
> that I can link the copy errors to which backend is erroring in the
> other system views or in the pg_log.
>
> Imagine a system where you have multiple network clients doing COPYs; if
> one of them starts bugging out and all I have is a tablename, filename
> and time, I'm not going to be able to figure out which client is causing
> the problems. The reason I mention this case is that I have a client
> who has a production application like this right now
Are all the clients copying the same file to the same table?
I would imagine that every client processes different files, so the
file names should make it easy to identify the faulty client. I am not
sure how the pid would help you identify the faulty client more easily.

Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com
