Re: ToDo: pg_backup - using a conditional DROP

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 08:05:45
Message-ID: CAFj8pRAKw6z8=WQ6Rk2npZoNiQwNspcqm3kJNcstvPwNt=NQGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

there is a request on enhancing of pg_backup to produce a conditional
DROPs. A reason for this request is more simple usage in very dynamic
production - cloud BI solution.

pg_backup can have a new option "--conditional-drops" and then pg_dump
will produce a DROP IF EXISTS statements instead DROP statements.

Ideas, comments?

Regards

Pavel Stehule


From: Torello Querci <tquerci(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 09:31:23
Message-ID: CA+igE6TvsXw3vhXnu5gptDv9=PFa_z0jC=xXJXZqVUeTUq6P=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/11/15 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> Hello,
>
> there is a request on enhancing of pg_backup to produce a conditional
> DROPs. A reason for this request is more simple usage in very dynamic
> production - cloud BI solution.
>
> pg_backup can have a new option "--conditional-drops" and then pg_dump
> will produce a DROP IF EXISTS statements instead DROP statements.
>
> Ideas, comments?
>
I think that if there is other way to get the same result in other way
is better to use it without add new options.

In this case, if you are on unix environment, I suppose that you can
use external batch to manipulate the output file, like "sed" I
suppose.
Obviusly this is my personal opinion.

Best Regards
Torello
> Regards
>
> Pavel Stehule
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 14:14:39
Message-ID: 27197.1321366479@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> there is a request on enhancing of pg_backup to produce a conditional
> DROPs. A reason for this request is more simple usage in very dynamic
> production - cloud BI solution.

> pg_backup can have a new option "--conditional-drops" and then pg_dump
> will produce a DROP IF EXISTS statements instead DROP statements.

That is not going to be possible unless we commit to having an IF EXISTS
option for every type of DROP statement, now and in the future.
Even then, it's not apparent to me that it solves any real use-case.
You're probably better off just using --clean and ignoring any errors.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 14:30:05
Message-ID: CAFj8pRA2uu8zDu19gZJSY0vQRbnZjZiRGCE7fp5JEF9hAKydig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/11/15 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> there is a request on enhancing of pg_backup to produce a conditional
>> DROPs. A reason for this request is more simple usage in very dynamic
>> production - cloud BI solution.
>
>> pg_backup can have a new option "--conditional-drops" and then pg_dump
>> will produce a DROP IF EXISTS statements instead DROP statements.
>
> That is not going to be possible unless we commit to having an IF EXISTS
> option for every type of DROP statement, now and in the future.
> Even then, it's not apparent to me that it solves any real use-case.
> You're probably better off just using --clean and ignoring any errors.
>

ook

Regards

Pavel Stehule

>                        regards, tom lane
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 14:59:17
Message-ID: CA+TgmoarBJAUXgFiB2Vf5ZfX=WT0ZmS_Eo392FD6kK=JHZ=k_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 9:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> there is a request on enhancing of pg_backup to produce a conditional
>> DROPs. A reason for this request is more simple usage in very dynamic
>> production - cloud BI solution.
>
>> pg_backup can have a new option "--conditional-drops" and then pg_dump
>> will produce a DROP IF EXISTS statements instead DROP statements.
>
> That is not going to be possible unless we commit to having an IF EXISTS
> option for every type of DROP statement, now and in the future.
> Even then, it's not apparent to me that it solves any real use-case.
> You're probably better off just using --clean and ignoring any errors.

Ignoring errors sucks, though, because sometimes you want the whole
thing to succeed or fail as a unit.

I'm wondering why we need an option for this, though. Assuming we
make DROP IF EXISTS work anywhere that it doesn't already, why not
just always produce that rather than straight DROP? It seems
categorically better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 15:36:44
Message-ID: 1321371052-sup-1604@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Robert Haas's message of mar nov 15 11:59:17 -0300 2011:
> On Tue, Nov 15, 2011 at 9:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> >> there is a request on enhancing of pg_backup to produce a conditional
> >> DROPs. A reason for this request is more simple usage in very dynamic
> >> production - cloud BI solution.
> >
> >> pg_backup can have a new option "--conditional-drops" and then pg_dump
> >> will produce a DROP IF EXISTS statements instead DROP statements.
> >
> > That is not going to be possible unless we commit to having an IF EXISTS
> > option for every type of DROP statement, now and in the future.
> > Even then, it's not apparent to me that it solves any real use-case.
> > You're probably better off just using --clean and ignoring any errors.
>
> Ignoring errors sucks, though, because sometimes you want the whole
> thing to succeed or fail as a unit.
>
> I'm wondering why we need an option for this, though. Assuming we
> make DROP IF EXISTS work anywhere that it doesn't already, why not
> just always produce that rather than straight DROP? It seems
> categorically better.

I think there's a fuzzy idea that we should try to keep our dumps
vaguely compatible with other systems. If we add DROP IF EXISTS
unconditionally, there would be no way to make them run elsewhere.

Of course, our dumps already fail for a lot of reasons (for example SET
commands and COPY), but I think if you dump with inserts and COPY and
have the other server ignore errors found while processing the script,
the idea is that you should find that mostly it loads the tables and
data. I don't know how well this principle works for the DROP commands.

I wonder if that instead of trying to remain "somewhat compatible" to
other systems we should instead have a mode specifically designed for
that --one which didn't output SET or backslash commands, used inserts
rather than COPY, etc-- and have the noncompatible mode offer nice
features such as DROP IF EXISTS and the like.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 17:57:31
Message-ID: CA+TgmoYS=iWHTpQOt8MOWR0G=whQpWqJi2S==Vw1uZqR=3ayCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 10:36 AM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
>> I'm wondering why we need an option for this, though.  Assuming we
>> make DROP IF EXISTS work anywhere that it doesn't already, why not
>> just always produce that rather than straight DROP?  It seems
>> categorically better.
>
> I think there's a fuzzy idea that we should try to keep our dumps
> vaguely compatible with other systems.  If we add DROP IF EXISTS
> unconditionally, there would be no way to make them run elsewhere.
>
> Of course, our dumps already fail for a lot of reasons (for example SET
> commands and COPY), but I think if you dump with inserts and COPY and
> have the other server ignore errors found while processing the script,
> the idea is that you should find that mostly it loads the tables and
> data.  I don't know how well this principle works for the DROP commands.

Well, except in --clean mode, we don't emit DROP commands at all. And
since --clean doesn't even work well on PostgreSQL, I can't get too
excited about whether it will work everywhere else.

> I wonder if that instead of trying to remain "somewhat compatible" to
> other systems we should instead have a mode specifically designed for
> that --one which didn't output SET or backslash commands, used inserts
> rather than COPY, etc-- and have the noncompatible mode offer nice
> features such as DROP IF EXISTS and the like.

mysqldump has a --compatible=OTHER_DB_SYSTEM flag (postgresql is one
of the choices). That might not be a crazy way to approach the
problem, though possibly we'd want --compatible=FOO to be a shorthand
for a collection of behaviors that could alternatively be selected
individually via appropriately named long options
(--no-backslash-commands, --no-set-commands, etc.).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 17:58:29
Message-ID: CAFNqd5VPbU2AMLvbcU9evCZvNybErQK8LA69uZ01ow2N47qCFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 10:36 AM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
>
> Excerpts from Robert Haas's message of mar nov 15 11:59:17 -0300 2011:
>> On Tue, Nov 15, 2011 at 9:14 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> >> there is a request on enhancing of pg_backup to produce a conditional
>> >> DROPs. A reason for this request is more simple usage in very dynamic
>> >> production - cloud BI solution.
>> >
>> >> pg_backup can have a new option "--conditional-drops" and then pg_dump
>> >> will produce a DROP IF EXISTS statements instead DROP statements.
>> >
>> > That is not going to be possible unless we commit to having an IF EXISTS
>> > option for every type of DROP statement, now and in the future.
>> > Even then, it's not apparent to me that it solves any real use-case.
>> > You're probably better off just using --clean and ignoring any errors.
>>
>> Ignoring errors sucks, though, because sometimes you want the whole
>> thing to succeed or fail as a unit.
>>
>> I'm wondering why we need an option for this, though.  Assuming we
>> make DROP IF EXISTS work anywhere that it doesn't already, why not
>> just always produce that rather than straight DROP?  It seems
>> categorically better.
>
> I think there's a fuzzy idea that we should try to keep our dumps
> vaguely compatible with other systems.  If we add DROP IF EXISTS
> unconditionally, there would be no way to make them run elsewhere.

Well, it seems to me there's a mixed signal here.

- When operating with Postgres -> Postgres, the suggestions are stuff like

"Oh, you can just ignore these errors"
"Oh, you can just use sed to change things to play better"

I think I'd rather have *some* option here of having there be some
benefit to "PG->PG". I'd rather hear things like...

"OK, if you're using some other database, you can just ignore these errors"
"OK, if you're using some other database that doesn't know about
DROP IF EXISTS, then you can just use sed to fix that"

> Of course, our dumps already fail for a lot of reasons (for example SET
> commands and COPY), but I think if you dump with inserts and COPY and
> have the other server ignore errors found while processing the script,
> the idea is that you should find that mostly it loads the tables and
> data.  I don't know how well this principle works for the DROP commands.

Back in either 8.1 or 8.2, I think it was, we added in a pretty
complete set of "DROP IF EXISTS" commands.

While I am not keen on adding 250 options to pg_dump, I think it's not
the most wonderful thing in the world to need to anticipate failures.

I'd rather have *some* controls that do NOT involve needing to write a
parser of a fragile combination of SQL as generated by pg_dump.

> I wonder if that instead of trying to remain "somewhat compatible" to
> other systems we should instead have a mode specifically designed for
> that --one which didn't output SET or backslash commands, used inserts
> rather than COPY, etc-- and have the noncompatible mode offer nice
> features such as DROP IF EXISTS and the like.

+1 on that, yeah.

The part that'll be nasty-ish is the question of to what degree we'd
like to be cross-PG-version compatible.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-15 18:53:33
Message-ID: m2ehx98bf6.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I wonder if that instead of trying to remain "somewhat compatible" to
>> other systems we should instead have a mode specifically designed for
>> that --one which didn't output SET or backslash commands, used inserts
>> rather than COPY, etc-- and have the noncompatible mode offer nice
>> features such as DROP IF EXISTS and the like.
>
> mysqldump has a --compatible=OTHER_DB_SYSTEM flag (postgresql is one
> of the choices). That might not be a crazy way to approach the
> problem, though possibly we'd want --compatible=FOO to be a shorthand
> for a collection of behaviors that could alternatively be selected
> individually via appropriately named long options
> (--no-backslash-commands, --no-set-commands, etc.).

I can't help but recalling Hannu's lightning talk at pgconf.eu in
Amsterdam last month. What about implementing mysql protocol and syntax
instead, so that users would just use mysqldump here, if that's the
format they do want to play with.

Not the same scale of work, but opening our infrastructure to multiple
syntaxes and protocols could be something to aim for. Think memcache
protocol backed by hstore or even plv8, too.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-16 03:27:19
Message-ID: 10184.1321414039@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Nov 15, 2011 at 10:36 AM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
>>> I'm wondering why we need an option for this, though. Assuming we
>>> make DROP IF EXISTS work anywhere that it doesn't already, why not
>>> just always produce that rather than straight DROP? It seems
>>> categorically better.

>> I think there's a fuzzy idea that we should try to keep our dumps
>> vaguely compatible with other systems. If we add DROP IF EXISTS
>> unconditionally, there would be no way to make them run elsewhere.

> Well, except in --clean mode, we don't emit DROP commands at all. And
> since --clean doesn't even work well on PostgreSQL, I can't get too
> excited about whether it will work everywhere else.

What I find lacking here is a clear explication of what the use-case is;
that is, this proposal seems like a solution in search of a problem.

The default case for pg_dump is that you're going to load a bunch of
objects into an empty database. You don't need any DROP commands,
and this always works fine (or if it doesn't, there's a clear bug to
be fixed in pg_dump).

The --clean switch seems to be targeted at the case that you're trying
to replace the contents of a database that has the same schema as the
one you dumped from. The DROPs will work, more or less, barring nasty
cases such as circular dependencies. (Maybe it will work even then,
but I don't know how carefully we've tested such cases.)

Now, --clean using DROP IF EXISTS would be targeted at, um, what case?
I guess the idea is to be able to load into a database that sort of
mostly shares the same schema as the one you dumped from, only it's not
the same (if it were the same, you'd not need IF EXISTS). The problem
with this is that if the schema isn't the same, it probably hasn't got
the same dependencies, so there's rather little likelihood that pg_dump
will correctly guess what order to issue the DROPs in ... and it
certainly won't know about dependencies to the target objects from other
objects that are in the destination database but weren't in the source.

Possibly you could get around that by ignoring errors; but if you're
willing to ignore errors, you don't need the IF EXISTS qualifiers.

So before buying into this proposal, I want to see a clear demonstration
of a common use-case where it actually does some good. I'm not
convinced that there is one.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-16 14:42:46
Message-ID: 1321454566.30837.1.camel@fsopti579.F-Secure.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tis, 2011-11-15 at 22:27 -0500, Tom Lane wrote:
> Now, --clean using DROP IF EXISTS would be targeted at, um, what case?
> I guess the idea is to be able to load into a database that sort of
> mostly shares the same schema as the one you dumped from, only it's
> not the same (if it were the same, you'd not need IF EXISTS).

What about a schema that is nominally the same, but some object is
missing, because some earlier attempt to clean things up, or because
it's an earlier version of the schema, or because of some batch jobs
make things come and go.


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ToDo: pg_backup - using a conditional DROP
Date: 2011-11-21 14:05:12
Message-ID: CAFj8pRC_4tgCVgyHLnGjOZ3j1WkhQ4XvCqx_OhTKc=oQWgrLqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

There is a use case from GoodData's engineer

---
We have many user projects, each project has his own database with
granted permissions.
We use pg_dump with option "--clean" which extends SQL dump with syntax like:
"DROP INDEX tab1_idx1;"
"DROP TABLE tab1;"

When we load such dump into database we get a lot of error messages saying:
"INDEX tab1_idx1 doesn't exists;"
"TABLE tab1 doesn't exists;"

We need "--clean" parameter, because we often need to load dump into
database which is not empty.
On the other hand, we want to get rid of ERROR messages in case some
of tables are missing.

Typically we are need to solve one of following situations:

1, dump user project, restore the data in new project.
Each project, even empty one contains at least table "const".
During load of SQL dump, const table from dump needs to overwrite the
one in database.

2, restore the data in project after fail of previous restoration
E.g. connection lost during restoration, no disc space left on device, ...

3, restore the data in project from backup
For some reason, data in the project are messed up and they need to be
overwritten from backup

Vlada
Vladimír Vacula (GoodData)

---

Regards

Pavel Stehule