Re: COPY enhancements

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY enhancements
Date: 2009-09-11 22:04:06
Message-ID: 4AAAC956.10908@agliodbs.com
Lists: pgsql-hackers

Greg,

> The performance of every path to get data into the database besides COPY
> is too miserable for us to use anything else, and the current
> inflexibility makes it useless for anything but the cleanest input data.

One potential issue we're facing down this road is that current COPY has
a dual purpose: for database restore, and for importing and exporting
data. At some point, we may want to separate those two behaviors,
because we'll be adding bells and whistles to import/export which slow
down overall performance or add bugs.

> The user-defined table for rejects is obviously exclusive of the system
> one, either of those would be fine from my perspective.

I've been thinking about it, and can't come up with a really strong case
for wanting a user-defined table if we settle the issue of having a
strong key for pg_copy_errors. Do you have one?

> I wasn't really pleased with the "if it's not the most general solution
> possible we're not interested" tone of Andrew's other COPY-change thread
> this week.

As someone who uses (and abuses) COPY constantly, I didn't leap at
Andrew's suggestion either because it wasn't *obviously* generally
applicable. We don't want to accept patches which are designed only to
solve the specific problems faced by one user. So for a feature
suggestion as specific as Andrew's, it's worth discussion ... out of
which came some interesting ideas, like copy to TEXT[].

Certainly we're not the project to add "quick hacks" where we can do better.

After some thought, I think that Andrew's feature *is* generally
applicable, if done as IGNORE COLUMN COUNT (or, more likely,
column_count=ignore). I can think of a lot of data sets where column
count is jagged and you want to do ELT instead of ETL. But I had to
give it some thought; as initially presented, the feature seemed very
single-user-specific.
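
To make the jagged-column case concrete, here is a minimal Python sketch of one possible reading of column_count=ignore. It is not COPY's actual behavior or Andrew's patch. It assumes short rows are padded with NULL (None here) and extra trailing fields are dropped; the function name normalize_rows and its parameters are hypothetical.

```python
import csv
import io


def normalize_rows(text, column_count, delimiter=","):
    """Yield rows that have exactly column_count fields.

    Assumed semantics (not confirmed by the thread): rows with too few
    fields are padded with None, rows with too many are truncated.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        # Multiplying a list by a negative number gives [], so rows that
        # are already long enough are only truncated, never padded.
        yield row[:column_count] + [None] * (column_count - len(row))


if __name__ == "__main__":
    jagged = "a,b,c\nd\ne,f,g,h\n"
    for fixed in normalize_rows(jagged, 3):
        print(fixed)
```

Loading rows this way and cleaning them up afterwards inside the database is the ELT pattern mentioned above: nothing is rejected at load time just because the column count is off.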

> I don't think there's *that* many common requests here that
> they can't all be handled by specific implementations,

I disagree. That way lies maintenance hell.

> and the scope
> creep of launching into a general framework for adding them is just
> going to lead to nothing useful getting committed.

As opposed to Tom, Peter and Heikki vetoing things because the feature
gain doesn't justify the maintenance burden? That's your real choice.
Adding a framework for manageable syntax extensions means that we can
be more liberal about what we justify as an extension.

There is a database which allows unrestricted addition of ad-hoc
features. It's called MySQL. They have double the lines of code we
do, and around 100x the outstanding bugs.

> If you want
> something really complicated, drop into a PL-based solution. The stuff
> I list above I see regular requests for at *every* PG installation I've
> ever been involved in, and it would be fantastic if they were available
> out of the box.

I don't think that anyone is talking about not adding this to core.
It's just a question of how we add it. In fact, it's mostly a question
of syntax.

> obviously go away. (The main reason I haven't pushed for us to submit
> our customizations here is that I know perfectly well the GUC-based UI
> isn't acceptable, but I haven't been able to get a better one done yet)

Well, now you can help Aster. ;-)

> If I were reviewing this I'd just
> kick it back as "separate these cleanly into separate patches where the
> partitioning one depends on the logging one" before even starting to
> look at the code, it's too much stuff to consume properly in one gulp.

Well, Bruce was supposed to be helping them submit it. And why *aren't*
you reviewing it?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com
