Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more

Lists: pgsql-bugspgsql-hackers
From: Martin Pitt <mpitt(at)debian(dot)org>
To: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Date: 2012-05-15 12:25:00
Message-ID: 20120515122500.GD3041@piware.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hello all,

while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
test suite noticed a regression: It seems that pg_restore --data-only
now skips the current value of sequences, so that in the upgraded
database the sequence counter is back to the default. This can lead to
rather serious data errors.

Reproducer:

* Create a db and sequence:

$ createdb test
$ psql test -c "CREATE SEQUENCE odd10 INCREMENT BY 2 MINVALUE 1 MAXVALUE 10 CYCLE"
CREATE SEQUENCE

* Advance it two steps:

$ psql --cluster 9.2/main -Atc "SELECT nextval('odd10')" test
1
$ psql --cluster 9.2/main -Atc "SELECT nextval('odd10')" test
3

* Dump schema and "test" data:

$ pg_dumpall -s > /tmp/schema
$ pg_dump -Fc test > /tmp/test.dump

* Drop db:

$ dropdb test

* Restore DB:

$ psql template1 < /tmp/schema
$ pg_restore --data-only -d postgres /tmp/test.dump

* Check the counter:

$ psql -Atc "SELECT nextval('odd10')" test
1

This is wrong, it should be "5", from the original database. This
worked all the way up to 9.1.

With "pg_restore /tmp/test.dump" (i. e. a full dump), you see

SELECT pg_catalog.setval('odd10', 3, true);

in the dump, but it is not present in the output of
"pg_restore --data-only /tmp/test.dump". However, when I use the 9.1
version, it works:

$ /usr/lib/postgresql/9.1/bin/pg_restore --data-only /tmp/test.dump | grep setval
SELECT pg_catalog.setval('odd10', 3, true);

Thanks!

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Pitt <mpitt(at)debian(dot)org>
Cc: PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Date: 2012-05-16 13:08:26
Message-ID: 23209.1337173706@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Martin Pitt <mpitt(at)debian(dot)org> writes:
> while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
> test suite noticed a regression: It seems that pg_restore --data-only
> now skips the current value of sequences, so that in the upgraded
> database the sequence counter is back to the default.

I believe this is a consequence of commit
a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the entirely
false assumption that --schema-only and --data-only have something to
do with the order that entries appear in the archive ...

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martin Pitt <mpitt(at)debian(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Date: 2012-05-16 14:23:07
Message-ID: CAD5tBc+L3=MEz3KCjwrgcR82hU7qrm6qUZdCs89VGyQj-WgPiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Martin Pitt <mpitt(at)debian(dot)org> writes:
> > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
> > test suite noticed a regression: It seems that pg_restore --data-only
> > now skips the current value of sequences, so that in the upgraded
> > database the sequence counter is back to the default.
>
> I believe this is a consequence of commit
> a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the entirely
> false assumption that --schema-only and --data-only have something to
> do with the order that entries appear in the archive ...
>
>
>

Darn, will investigate.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martin Pitt <mpitt(at)debian(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Date: 2012-05-21 18:59:34
Message-ID: 4FBA9096.30107@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On 05/16/2012 10:23 AM, Andrew Dunstan wrote:
>
>
> On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>
> Martin Pitt <mpitt(at)debian(dot)org <mailto:mpitt(at)debian(dot)org>> writes:
> > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
> > test suite noticed a regression: It seems that pg_restore
> --data-only
> > now skips the current value of sequences, so that in the upgraded
> > database the sequence counter is back to the default.
>
> I believe this is a consequence of commit
> a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the
> entirely
> false assumption that --schema-only and --data-only have something to
> do with the order that entries appear in the archive ...
>
>
>
> Darn, will investigate.
>
>

[cc -hackers]

Well, the trouble is that we have these pesky SECTION_NONE entries for
things like comments, security labels and ACLs that need to be dumped in
the right section, so we can't totally ignore the order. But we could
(and probably should) ignore the order for making decisions about
everything BUT those entries.

So, here's a revised plan:

--section=data will dump exactly TABLE DATA, SEQUENCE SET or BLOBS
entries
--section=pre-data will dump SECTION_PRE_DATA items (other than
SEQUENCE SET) plus any immediately following SECTION_NONE items.
--section=post-data will dump everything else.

Comments?

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martin Pitt <mpitt(at)debian(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Date: 2012-05-25 18:21:59
Message-ID: 4FBFCDC7.8070209@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On 05/21/2012 02:59 PM, Andrew Dunstan wrote:
>
>
> On 05/16/2012 10:23 AM, Andrew Dunstan wrote:
>>
>>
>> On Wed, May 16, 2012 at 9:08 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
>> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> wrote:
>>
>> Martin Pitt <mpitt(at)debian(dot)org <mailto:mpitt(at)debian(dot)org>> writes:
>> > while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
>> > test suite noticed a regression: It seems that pg_restore
>> --data-only
>> > now skips the current value of sequences, so that in the upgraded
>> > database the sequence counter is back to the default.
>>
>> I believe this is a consequence of commit
>> a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which introduced the
>> entirely
>> false assumption that --schema-only and --data-only have
>> something to
>> do with the order that entries appear in the archive ...
>>
>>
>>
>> Darn, will investigate.
>>
>>
>
> [cc -hackers]
>
> Well, the trouble is that we have these pesky SECTION_NONE entries for
> things like comments, security labels and ACLs that need to be dumped
> in the right section, so we can't totally ignore the order. But we
> could (and probably should) ignore the order for making decisions
> about everything BUT those entries.
>
> So, here's a revised plan:
>
> --section=data will dump exactly TABLE DATA, SEQUENCE SET or BLOBS
> entries
> --section=pre-data will dump SECTION_PRE_DATA items (other than
> SEQUENCE SET) plus any immediately following SECTION_NONE items.
> --section=post-data will dump everything else.
>
>

It turns out there were some infelicities with pg_dump as well as with
pg_restore.

I think the attached patch does the right thing. I'll keep testing -
I'll be happier if other people bang on it too.

cheers

andrew

Attachment Content-Type Size
sectionfix.patch text/x-patch 3.3 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Martin Pitt <mpitt(at)debian(dot)org>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Re: 9.2beta1 regression: pg_restore --data-only does not set sequence values any more
Date: 2012-05-29 22:05:25
Message-ID: 426.1338329125@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> Martin Pitt <mpitt(at)debian(dot)org <mailto:mpitt(at)debian(dot)org>> writes:
>>>> while packaging 9.2 beta 1 for Debian/Ubuntu the postgresql-common
>>>> test suite noticed a regression: It seems that pg_restore --data-only
>>>> now skips the current value of sequences, so that in the upgraded
>>>> database the sequence counter is back to the default.

> It turns out there were some infelicities with pg_dump as well as with
> pg_restore.

> I think the attached patch does the right thing. I'll keep testing -
> I'll be happier if other people bang on it too.

After looking this over, I think the original patch was just
fundamentally wrong and needs to be largely rewritten. The basic
error was in saying that the existing options --schema-only and
--data-only were equivalent to particular cases of --section, which
they are not. The proposed new patch does not make this better;
it just makes the logic even more spaghetti-ish.

I think what we need is to rip all that out and treat --section as being
a new option that's not tied to the old ones in any way, but is an
entirely orthogonal filter. The right place to implement it (for either
pg_dump or pg_restore) is in the TOC-scanning loops in
pg_backup_archiver.c, which can track which section they are in fairly
easily (probably define it as being the current item's section unless
that's SECTION_NONE, in which case use the previous section value).

BTW, I'm thinking we could make that code simpler and faster if the
_tocEntryRequired logic were done only once in an initial pass, and then
we stored the teReqs result into a work field in the TocEntry struct
for use in later passes.

Andrew told me off-list that he would be unavailable due to travel for
awhile, so I will have a go at fixing this.

regards, tom lane