pg_restore silently chokes on object comments/descriptions ending in a backslash

Lists: pgsql-bugs
From: Julian Mehnle <julian(at)mehnle(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Subject: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 00:42:18
Message-ID: 201107270042.22427.julian@mehnle.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I observe the following issue on PostgreSQL 9.0.4 on at least the
following platforms:

* FreeBSD 6.3 (amd64)
`uname -a`:
FreeBSD <hostname> 6.3-STABLE FreeBSD 6.3-STABLE #1: Fri May 30 18:11:47 PDT 2008
root@<hostname>:/data/obj/data/home/<username>/symbols/builddir_amd64/usr/src/sys/MESSAGING_GATEWAY.amd64_INSTALL amd64

* Mac OS X 10.6.8 (i386)
`uname -a`:
Darwin joule 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386 i386

* semi-current Debian testing (amd64)
`uname -a`:
Linux gray 2.6.30-2-amd64 #1 SMP Mon Dec 7 05:21:45 UTC 2009 x86_64 GNU/Linux

If the comment/description of a database object (table, function, etc.)
ends in a backslash (which generally works fine otherwise), then
pg_restore is unable to completely restore a custom-format dump of the
schema. pg_restore does not complain, but silently(!) stops issuing DDL
statements to the server starting with the first "COMMENT ON …" statement
that would have set an object comment/description ending in a backslash.

Reproduce as follows:

$ createdb test0
$ createdb test1
$ psql -c "CREATE TABLE bar (); COMMENT ON TABLE bar IS 'bar\\';" test0
COMMENT
$ psql -c "CREATE TABLE foo (); COMMENT ON TABLE foo IS 'foo';" test0
COMMENT
$ pg_dump --format custom --file test0.pg_dump --schema-only test0
$ pg_restore -d test1 test0.pg_dump
$ psql -c '\dt+' test0
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+--------+---------+-------------
public | bar | table | julian | 0 bytes | bar\
public | foo | table | julian | 0 bytes | foo
(2 rows)

$ psql -c '\dt+' test1
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+--------+---------+-------------
public | bar | table | julian | 0 bytes |
(1 row)

This also happens with PostgreSQL 8.4.

To demonstrate that this is not an academic issue, these are a few
functions I have defined, and their comments:

List of functions
-[ RECORD 1 ]-------+------------------------------------------------------------------------
Schema | public
Name | escape_are
... : ...
Description | escape advanced regexp (ARE) special characters: .*+?|[](){}^$\
-[ RECORD 2 ]-------+------------------------------------------------------------------------
Schema | public
Name | escape_control
... : ...
Description | escape control characters: \a\b\t\n\v\f\r\e\\
-[ RECORD 3 ]-------+------------------------------------------------------------------------
Schema | public
Name | escape_like
... : ...
Description | escape LIKE pattern special characters: %_\

I have worked around the issue by appending a space character to each of
those function descriptions. What makes the problem really bad is that it
silently renders your custom-format database dumps (which pg_dump creates
just fine) useless, which you notice only after you do a restore (without
an error being thrown) and your restored database being incomplete.

-Julian


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Julian Mehnle <julian(at)mehnle(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 03:27:14
Message-ID: 23791.1311737234@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Julian Mehnle <julian(at)mehnle(dot)net> writes:
> If the comment/description of a database object (table, function, etc.)
> ends in a backslash (which generally works fine otherwise), then
> pg_restore is unable to completely restore a custom-format dump of the
> schema.

Reproduced here against HEAD. The problem seems to be that
pg_backup_db.c's _sendSQLLine() contains a mini SQL lexer that is not
cognizant of standard_conforming_strings. Not sure about a simple fix,
and I rather wonder if we shouldn't try to remove that code entirely
instead of "fix" it.

As a temporary workaround, the SQL text file that pg_restore produces
by default seems to be valid, so you could pipe that into psql.

regards, tom lane


From: Julian Mehnle <julian(at)mehnle(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 04:01:57
Message-ID: 201107270402.02561.julian@mehnle.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I'm subscribed to the pgsql-bugs list, so no need to CC me. :-)

Tom Lane wrote:

> Reproduced here against HEAD. The problem seems to be that
> pg_backup_db.c's _sendSQLLine() contains a mini SQL lexer that is not
> cognizant of standard_conforming_strings.

Oh, right, I forgot to mention I have standard_conforming_strings enabled.
However, I understand that is to become the new default anyway.

> Not sure about a simple fix,
> and I rather wonder if we shouldn't try to remove that code entirely
> instead of "fix" it.

What would "removing that code entirely" mean?

> As a temporary workaround, the SQL text file that pg_restore produces
> by default seems to be valid, so you could pipe that into psql.

Hmm, right. So at least my existing dumps aren't useless. Thanks.

-Julian


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Julian Mehnle <julian(at)mehnle(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 14:35:19
Message-ID: 7994.1311777319@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Julian Mehnle <julian(at)mehnle(dot)net> writes:
> I'm subscribed to the pgsql-bugs list, so no need to CC me. :-)

cc to people in the thread is the established practice on these lists.
It provides a bit more robustness when the lists are busy or slow.
You can set your subscription so the listserv won't send you an extra
copy, I believe, but I don't know the incantation offhand.

> Tom Lane wrote:
>> Not sure about a simple fix,
>> and I rather wonder if we shouldn't try to remove that code entirely
>> instead of "fix" it.

> What would "removing that code entirely" mean?

I was wondering why it's necessary to parse the entry in the dump file
at all, rather than just spit it out to PQexec as-is. There's probably
a reason, but maybe we can find another way to solve whatever the
real problem is.

regards, tom lane


From: Julian Mehnle <julian(at)mehnle(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 17:28:21
Message-ID: 201107271728.25202.julian@mehnle.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:

> Julian Mehnle <julian(at)mehnle(dot)net> writes:
> > I'm subscribed to the pgsql-bugs list, so no need to CC me. :-)
>
> cc to people in the thread is the established practice on these lists.
> It provides a bit more robustness when the lists are busy or slow.

Got it. I'll try to remember that and CC others when posting, and will
also not take offense when receiving extra CCs.

> > What would "removing that code entirely" mean?
>
> I was wondering why it's necessary to parse the entry in the dump file
> at all, rather than just spit it out to PQexec as-is. There's probably
> a reason, but maybe we can find another way to solve whatever the
> real problem is.

I see. Given that I'm only a Pg user, not a Pg developer, I have no
opinion on this. Thanks for the explanation, though!

-Julian


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Julian Mehnle <julian(at)mehnle(dot)net>
Cc: Pg Bugs <pgsql-bugs(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 17:44:28
Message-ID: 1311788457-sup-6543@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Excerpts from Julian Mehnle's message of mié jul 27 13:28:21 -0400 2011:
> Tom Lane wrote:
>
> > Julian Mehnle <julian(at)mehnle(dot)net> writes:
> > > I'm subscribed to the pgsql-bugs list, so no need to CC me. :-)
> >
> > cc to people in the thread is the established practice on these lists.
> > It provides a bit more robustness when the lists are busy or slow.
>
> Got it. I'll try to remember that and CC others when posting, and will
> also not take offense when receiving extra CCs.

If you're on procmail, you can get very easily use a well-known de-dupe
recipe and it all works wonderfully.

:0 Wh: msgid.lock
| formail -D 65536 $HOME/.msgid.cache

I assume similar tools can use equivalent mechanisms.

Note that CCing others is customary but obviously not mandatory.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Julian Mehnle <julian(at)mehnle(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-27 18:30:09
Message-ID: 201107271830.10322.julian@mehnle.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Alvaro Herrera wrote:

> Note that CCing others is customary but obviously not mandatory.

For the record, CCing posters who haven't explicitly requested it is
frowned upon on the Debian mailing lists , but apparently those have a
lower latency than the Pg ones. :-)

I'll shut up now since this is drifting off topic.

-Julian


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Julian Mehnle <julian(at)mehnle(dot)net>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: pg_restore silently chokes on object comments/descriptions ending in a backslash
Date: 2011-07-28 18:10:23
Message-ID: 1908.1311876623@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I wrote:
> Julian Mehnle <julian(at)mehnle(dot)net> writes:
>> What would "removing that code entirely" mean?

> I was wondering why it's necessary to parse the entry in the dump file
> at all, rather than just spit it out to PQexec as-is. There's probably
> a reason, but maybe we can find another way to solve whatever the
> real problem is.

I've applied a patch along those lines. Thanks for the report!

regards, tom lane