From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, Damir <dam(dot)bel07(at)gmail(dot)com>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, anisimow(dot)d(at)gmail(dot)com, HukuToc(at)gmail(dot)com, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru> |
Subject: | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) |
Date: | 2023-11-16 00:00:00 |
Message-ID: | CACJufxG31R-L7HZ3u3QDPLY8GLmvYjvD-o-+mj5Z2ArMvzCZQw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Nov 9, 2023 at 4:12 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Daniel Gustafsson <daniel(at)yesql(dot)se> writes:
> >> On 8 Nov 2023, at 19:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I think an actually usable feature of this sort would involve
> >> copying all the failed lines to some alternate output medium,
> >> perhaps a second table with a TEXT column to receive the original
> >> data line. (Or maybe an array of text that could receive the
> >> broken-down field values?) Maybe we could dump the message info,
> >> line number, field name etc into additional columns.
>
> > I agree that the errors should be easily visible to the user in some way. The
> > feature is for sure interesting, especially in data warehouse type jobs where
> > dirty data is often ingested.
>
> I agree it's interesting, but we need to get it right the first time.
>
> Here is a very straw-man-level sketch of what I think might work.
> The option to COPY FROM looks something like
>
> ERRORS TO other_table_name (item [, item [, ...]])
>
> where the "items" are keywords identifying the information item
> we will insert into each successive column of the target table.
> This design allows the user to decide which items are of use
> to them. I envision items like
>
>     LINENO    bigint   COPY line number, counting from 1
>     LINE      text     raw text of line (after encoding conversion)
>     FIELDS    text[]   separated, de-escaped string fields (the data
>                        that was or would be fed to input functions)
>     FIELD     text     name of troublesome field, if field-specific
>     MESSAGE   text     error message text
>     DETAIL    text     error message detail, if any
>     SQLSTATE  text     error SQLSTATE code
>
How about just

    SAVE ERRORS

which automatically creates a table to hold the errors (after validating
that the auto-generated table name is unique and that the user has
CREATE privilege)? The table would have the columns listed above. If no
errors occur, the table gets dropped.
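
To make the intended behaviour concrete, here is a rough simulation in Python. It is not PostgreSQL code and not from any patch: the function name copy_with_saved_errors, the CopyError record, and the comma delimiter are all made up for illustration. Only the LINENO, LINE, FIELDS, FIELD and MESSAGE items are modelled; DETAIL and SQLSTATE are left out. Returning None for the error list stands in for "the table gets dropped".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class CopyError:
    """One row of the hypothetical error table (subset of the proposed items)."""
    lineno: int            # LINENO: input line number, counting from 1
    line: str              # LINE: raw text of the line
    fields: List[str]      # FIELDS: the separated string fields
    field: Optional[str]   # FIELD: troublesome column name, if field-specific
    message: str           # MESSAGE: error message text


def copy_with_saved_errors(
    lines: Sequence[str],
    columns: Sequence[Tuple[str, Callable[[str], object]]],
    delimiter: str = ",",
):
    """Load good lines, divert bad ones to an error list.

    `columns` is a list of (name, converter) pairs; a converter plays the
    role of a type input function and signals bad input with ValueError.
    Returns (rows, errors), where errors is None if nothing failed.
    """
    rows = []
    errors = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split(delimiter)
        if len(fields) != len(columns):
            # Not tied to a single field, so FIELD stays empty.
            msg = f"expected {len(columns)} fields, got {len(fields)}"
            errors.append(CopyError(lineno, line, fields, None, msg))
            continue
        values = []
        for (name, convert), raw in zip(columns, fields):
            try:
                values.append(convert(raw))
            except ValueError as exc:
                errors.append(CopyError(lineno, line, fields, name, str(exc)))
                break
        else:
            # Every field converted cleanly: the line is loaded.
            rows.append(tuple(values))
    # "If no error then table gets dropped": signal that with None.
    return rows, (errors or None)
```

For example, feeding it the lines "1,alice", "x,bob" and "3" with columns id (int) and name (str) loads only the first line; the second is recorded against field id and the third as a field-count error with no field name.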