Re: pg_dump insert with column names speedup

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump insert with column names speedup
Date: 2013-10-04 23:52:31
Message-ID: CAApHDvq7qxY3FwHHuFiHw0q_Qdqou71cX6eh8Njun5ym0NqwHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Oct 5, 2013 at 6:39 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > On Fri, Oct 4, 2013 at 11:28 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > >> Here's a small patch which greatly increases the speed of
> > >> pg_dump --column-inserts.
> > >
> > > The reason why no one's paid any attention to the speed of that code
> path
> > > is that if you care about dump/restore speed, you should be using the
> COPY
> > > code paths instead. Is it really worth adding code and complexity to
> > > pg_dump for this?
> >
> > One possible reason to care about this is if you're trying to move
> > data to another database. The INSERT format is more portable.
> >
> > Also, this isn't really adding any net code or complexity AFAICS.
>
> Agreed- this looks more like a "gee, that makes a lot of sense" than a
> "wow, that's way more complicated". Not a whole lot of point in
> building up a known-to-be-constant string on every iteration of the
> loop.
>
>
These words made me think that the changes I made were not quite enough to
satisfy this what you said.
I understand that most people won't use the --column-inserts feature, but
that's not really a great reason to have not very clever and wasteful code
in there. This fruit was so low hanging it was pretty much touching the
ground, so I couldn't resist fixing it when I saw it.

The attached revised patch goes a little further and prepares everything
that is constant on processing the first row, this now includes the "INSERT
INTO tablename " part. I don't think the changes make the code any harder
to read, the code which builds the staticStmt fits into my small laptop
screen.

The timings with my benchmark look something like:

Unpatched: 9200 ms
Version 0.1: 5700 ms
Version 0.2: 5250 ms

So it does shave off a bit more, for what it's worth.

David

> Thanks,
>
> Stephen
>

Attachment Content-Type Size
pg_dump_colinsert_v0.2.patch application/octet-stream 3.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kohei KaiGai 2013-10-05 08:01:07 Re: Custom Plan node
Previous Message Pavel Stehule 2013-10-04 21:18:44 Re: ToDo: fast update of arrays with fixed length fields for PL/pgSQL