[PATCH] pg_dump: Sort overloaded functions in deterministic order

Lists: pgsql-hackers
From: Joel Jacobson <joel(at)trustly(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Joel Jacobson <joel(at)trustly(dot)com>
Subject: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-04 16:44:54
Message-ID: CAASwCXf9x67DsTp_N44N8L1q7LiyTJeHoKK-8DT0ykYr+J-CaA@mail.gmail.com

I renamed the new element in DumpableObject from "proargs" to the more
general name "sortkey".

This way the element can be used by any object type that might, in the
future, require sorting by more than type, namespace and name.

Currently it's only set for functions and aggregates, though; it's NULL for
all other object types.

It felt less ugly to add a new element with a general name than one
specific to functions.

I also moved the check to the last part of DOTypeNameCompare, just before
sorting by OIDs as a last resort.

Feedback on the implementation is welcome.

If this can be achieved without adding a new element to DumpableObject,
it is of course much better, but I couldn't find a way of doing that.
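
For illustration, the comparison order described in this message (type,
namespace, name, then the extra sort key, then OID as a last resort) could be
sketched as below. This is a simplified stand-in and not the patch itself: the
DemoObject struct, its fields and demo_type_name_compare are hypothetical
names, and the real DOTypeNameCompare handles more cases.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for pg_dump's DumpableObject. */
typedef struct DemoObject
{
    int          objType;   /* object type code */
    const char  *nspname;   /* schema name */
    const char  *name;      /* object name */
    const char  *sortkey;   /* extra key; NULL for most object types */
    unsigned int oid;       /* catalog OID, compared only as a last resort */
} DemoObject;

/* qsort comparator over an array of DemoObject pointers. */
static int
demo_type_name_compare(const void *p1, const void *p2)
{
    const DemoObject *a = *(const DemoObject *const *) p1;
    const DemoObject *b = *(const DemoObject *const *) p2;
    int         cmp;

    assert(a != NULL && b != NULL);

    /* 1. object type */
    if (a->objType != b->objType)
        return (a->objType < b->objType) ? -1 : 1;

    /* 2. namespace, 3. name */
    cmp = strcmp(a->nspname, b->nspname);
    if (cmp != 0)
        return cmp;
    cmp = strcmp(a->name, b->name);
    if (cmp != 0)
        return cmp;

    /* 4. extra sort key, only when both objects carry one */
    if (a->sortkey != NULL && b->sortkey != NULL)
    {
        cmp = strcmp(a->sortkey, b->sortkey);
        if (cmp != 0)
            return cmp;
    }

    /* 5. OID: stable within one database, but not across dump/restore */
    if (a->oid != b->oid)
        return (a->oid < b->oid) ? -1 : 1;
    return 0;
}
```

With two overloads of the same function name, the argument string decides the
order, so the result no longer depends on which overload happened to receive
the lower OID.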

Attachment Content-Type Size
pg_dump_deterministic_order_v2.patch application/octet-stream 20.2 KB

From: Joel Jacobson <joel(at)trustly(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Joel Jacobson <joel(at)trustly(dot)com>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-05 13:07:10
Message-ID: CAASwCXdRPFER_xTxmOESomiDeLYXhp1kfpn_gvLd3t1vjv-fAQ@mail.gmail.com

New version, made a typo in last one.

Attachment Content-Type Size
pg_dump_deterministic_order_v3.patch application/octet-stream 20.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joel Jacobson <joel(at)trustly(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-05 18:29:54
Message-ID: 18078.1341512994@sss.pgh.pa.us

Joel Jacobson <joel(at)trustly(dot)com> writes:
> New version, made a typo in last one.

I'm not particularly happy with the idea of adding a sortkey field to
DumpableObject as such, when most object types don't need it. That just
bloats the code and pg_dump's memory consumption. It would be better to
modify the already-existing object-type-specific special cases in
DOTypeNameCompare to take additional information into account as needed.

BTW, I see no reason to be adding extra calls of
pg_get_function_identity_arguments. What is wrong with the funcsig or
aggsig strings that the code already computes?

regards, tom lane


From: Joel Jacobson <joel(at)trustly(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-05 18:35:41
Message-ID: CAASwCXdtEWWGzX=Egw--prWcWaoj_EYMBiwMC161XBxHsicT_g@mail.gmail.com

I agree, good suggestion, I just didn't know how to implement it without a
new field. I'll make a new attempt to get it right.

On Thursday, July 5, 2012, Tom Lane wrote:

> Joel Jacobson <joel(at)trustly(dot)com> writes:
> > New version, made a typo in last one.
>
> I'm not particularly happy with the idea of adding a sortkey field to
> DumpableObject as such, when most object types don't need it. That just
> bloats the code and pg_dump's memory consumption. It would be better to
> modify the already-existing object-type-specific special cases in
> DOTypeNameCompare to take additional information into account as needed.
>
> BTW, I see no reason to be adding extra calls of
> pg_get_function_identity_arguments. What is wrong with the funcsig or
> aggsig strings that the code already computes?
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joel Jacobson <joel(at)trustly(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-05 20:33:30
Message-ID: 25642.1341520410@sss.pgh.pa.us

Joel Jacobson <joel(at)trustly(dot)com> writes:
> I agree, good suggestion, I just didn't know how to implement it without a
> new field. I'll make a new attempt to get it right.

You may in fact need a new field --- I'm just saying it should be in the
object-type-specific struct, eg FuncInfo, not DumpableObject.

regards, tom lane
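
As a rough sketch of the layout suggested here, the extra field can live in
the function-specific struct, which embeds the common struct as its first
member, and the comparator looks at it only once both objects are known to be
functions. DemoBase, DemoFuncInfo, the DEMO_* codes and demo_compare are
hypothetical names chosen for this example; only the idea of a per-type field
comes from the message above.

```c
#include <assert.h>
#include <string.h>

#define DEMO_TABLE 1
#define DEMO_FUNC  2

/* Hypothetical common header shared by every object type. */
typedef struct DemoBase
{
    int          objType;
    const char  *name;
    unsigned int oid;
} DemoBase;

/* Function-specific struct: only functions pay for the extra pointer. */
typedef struct DemoFuncInfo
{
    DemoBase     dobj;      /* must be first so DemoBase * can be cast back */
    const char  *funcsig;   /* e.g. "f(integer)" */
} DemoFuncInfo;

static int
demo_compare(const DemoBase *a, const DemoBase *b)
{
    int         cmp;

    assert(a != NULL && b != NULL);

    if (a->objType != b->objType)
        return (a->objType < b->objType) ? -1 : 1;

    cmp = strcmp(a->name, b->name);
    if (cmp != 0)
        return cmp;

    /* Type-specific special case: both objects are functions here. */
    if (a->objType == DEMO_FUNC)
    {
        const DemoFuncInfo *fa = (const DemoFuncInfo *) a;
        const DemoFuncInfo *fb = (const DemoFuncInfo *) b;

        cmp = strcmp(fa->funcsig, fb->funcsig);
        if (cmp != 0)
            return cmp;
    }

    if (a->oid != b->oid)
        return (a->oid < b->oid) ? -1 : 1;
    return 0;
}
```

The common struct stays the same size, so objects of other types do not grow.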


From: Joel Jacobson <joel(at)trustly(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-05 20:36:41
Message-ID: CAASwCXcoQAuV8MafNLda5qC3bqFWOFHYON1fngkr79fDmM1j7Q@mail.gmail.com

Roger that. I'm on it.

On Thursday, July 5, 2012, Tom Lane wrote:

> Joel Jacobson <joel(at)trustly(dot)com> writes:
> You may in fact need a new field --- I'm just saying it should be in the
> object-type-specific struct, eg FuncInfo, not DumpableObject.
>
> regards, tom lane
>


From: Joel Jacobson <joel(at)trustly(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-07-06 12:04:24
Message-ID: CAASwCXdHAn72bYusMF0CjC4Zr=H+ExdFCd1tSph3cb24Tefo4w@mail.gmail.com

On Thu, Jul 5, 2012 at 10:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> You may in fact need a new field --- I'm just saying it should be in the
> object-type-specific struct, eg FuncInfo, not DumpableObject.

I suggest adding char *funcsig to FuncInfo, and moving the "funcsig =
format_function_arguments(finfo, funciargs)" code from dumpFunc to getFuncs.

Because dumpFunc is called after sortDumpableObjectsByTypeName, setting
funcsig in the FuncInfo struct in dumpFunc wouldn't work, as it needs to be
available when entering sortDumpableObjectsByTypeName.

What do you think?
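
The ordering constraint described above can be sketched as three phases:
collect, sort, dump. The sort key has to be filled in during collection,
because the sort runs before any per-object dump routine. Everything below
(DemoFunc, demo_collect, demo_sort, the fixed-size buffer) is a hypothetical
simplification, not pg_dump code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for FuncInfo with the proposed signature field. */
typedef struct DemoFunc
{
    const char *name;
    const char *argtypes;
    char        funcsig[64];    /* filled in at collection time */
} DemoFunc;

/* Collection phase (the role getFuncs plays): build the sort key early. */
static void
demo_collect(DemoFunc *funcs, size_t n)
{
    size_t      i;

    assert(funcs != NULL || n == 0);
    for (i = 0; i < n; i++)
        snprintf(funcs[i].funcsig, sizeof(funcs[i].funcsig),
                 "%s(%s)", funcs[i].name, funcs[i].argtypes);
}

/* Compare two functions by their precomputed signature. */
static int
demo_sig_compare(const void *p1, const void *p2)
{
    const DemoFunc *a = (const DemoFunc *) p1;
    const DemoFunc *b = (const DemoFunc *) p2;

    return strcmp(a->funcsig, b->funcsig);
}

/* Sort phase (the role sortDumpableObjectsByTypeName plays). */
static void
demo_sort(DemoFunc *funcs, size_t n)
{
    qsort(funcs, n, sizeof(DemoFunc), demo_sig_compare);
}
```

If demo_collect were skipped and the signature were only built later, in the
dump phase, demo_sort would compare empty strings and the order of overloads
would again be left to chance.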


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Joel Jacobson <joel(at)trustly(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-10-17 21:43:51
Message-ID: 20121017214350.GN5217@alvh.no-ip.org

Joel Jacobson wrote:
> On Thu, Jul 5, 2012 at 10:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > You may in fact need a new field --- I'm just saying it should be in the
> > object-type-specific struct, eg FuncInfo, not DumpableObject.
>
>
> I suggest adding char *funcsig to FuncInfo, and moving the "funcsig =
> format_function_arguments(finfo, funciargs)" code from dumpFunc to getFuncs.
>
> Because dumpFunc is called after sortDumpableObjectsByTypeName, setting
> funcsig in the FuncInfo struct in dumpFunc wouldn't work, as it needs to be
> available when entering sortDumpableObjectsByTypeName.

Uh, the patch you posted keeps the pg_get_function_identity_arguments
call in dumpFunc, but there is now also a new one in getFuncs. Do we
need to remove the second one?

Here's an updated patch for your consideration. I was about to push
this when I noticed the above. The only change here is that the extra
code that tests for new remoteVersions in the second "else if" branch of
getFuncs and getAggregates has been removed, since it cannot ever be
reached.

(I tested the new pg_dump with 8.2 and HEAD and also verified it passes
pg_upgrade's "make check". I didn't test with other server versions.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
pg_dump_deterministic_order_v6.patch text/x-diff 7.9 KB

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Joel Jacobson <joel(at)trustly(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-10-18 01:08:58
Message-ID: CACw0+12pD2iprrfGtScggeN6ZPbXQBXY_eS6dCH9Z-vQA_JdKg@mail.gmail.com

On Wed, Oct 17, 2012 at 5:43 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> (I tested the new pg_dump with 8.2 and HEAD and also verified it passes
> pg_upgrade's "make check". I didn't test with other server versions.)

I also tested against 8.3 and 8.4 since 8.4 is the version that
introduced pg_get_function_identity_arguments. The included testcase
fails on 8.3 and succeeds on 8.4 (pg_dump succeeds in both cases of
course but it's only ordered deterministically in 8.4+).
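
The version dependence noted above could look roughly like the following
check on the server version number (80400 corresponds to 8.4). The helper
name demo_identity_args_expr, the "p" table alias and the NULL fallback for
older servers are assumptions made for this sketch; the committed patch may
handle older versions differently.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: pick the SQL expression used to fetch a function's
 * identity arguments, depending on the remote server version.
 */
static const char *
demo_identity_args_expr(int remoteVersion)
{
    assert(remoteVersion > 0);

    /* pg_get_function_identity_arguments() is available from 8.4 on */
    if (remoteVersion >= 80400)
        return "pg_catalog.pg_get_function_identity_arguments(p.oid)";

    /* Assumed fallback: no extra key, so overloads fall back to OID order */
    return "NULL";
}
```

Against an 8.3 server this yields no extra key, which matches the observation
that the dump succeeds there but is not ordered deterministically.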


From: Joel Jacobson <joel(at)trustly(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joachim Wieland <joe(at)mcknight(dot)de>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-10-18 10:36:28
Message-ID: CAASwCXcshrZ2oWGdK_-O0vMVs8DnOV75d+-c9iGvv2C5aJepnw@mail.gmail.com

On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Uh, the patch you posted keeps the pg_get_function_identity_arguments
> call in dumpFunc, but there is now also a new one in getFuncs. Do we
> need to remove the second one?

It could be done, but unfortunately we cannot use the value computed in
dumpFunc(), because getFuncs() is called before dumpFunc().

The patch currently only affects getFuncs(), it doesn't touch dumpFunc().

What could be done is to keep the changes in getFuncs(), and also change
dumpFunc() to use the value computed in getFuncs(), but I think the gain is
small in relation to the complexity of changing dumpFunc(), as we would still
need to make the two other function calls in the SQL query in dumpFunc() to
pg_get_function_arguments() and pg_get_function_result().

> Here's an updated patch for your consideration. I was about to push
> this when I noticed the above. The only change here is that the extra
> code that tests for new remoteVersions in the second "else if" branch of
> getFuncs and getAggregates has been removed, since it cannot ever be
> reached.

Looks really good.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Joel Jacobson <joel(at)trustly(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Joachim Wieland <joe(at)mcknight(dot)de>
Subject: Re: [PATCH] pg_dump: Sort overloaded functions in deterministic order
Date: 2012-10-18 15:26:33
Message-ID: 20121018152633.GF1982@alvh.no-ip.org

Joel Jacobson wrote:
> On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Uh, the patch you posted keeps the pg_get_function_identity_arguments
> > call in dumpFunc, but there is now also a new one in getFuncs. Do we
> > need to remove the second one?
>
> It could be done, but unfortunately we cannot use the value computed in
> dumpFunc(), because getFuncs() is called before dumpFunc().

Right, I got that from the discussion.

> What could be done is to keep the changes in getFuncs(), and also change
> dumpFunc() to use the value computed in getFuncs(), but I think the gain is
> small in relation to the complexity of changing dumpFunc(), as we would still
> need to make the two other function calls in the SQL query in dumpFunc() to
> pg_get_function_arguments() and pg_get_function_result().

Changing pg_dump is complex enough whatever the change, yes. I have not
touched this.

> > Here's an updated patch for your consideration. I was about to push
> > this when I noticed the above. The only change here is that the extra
> > code that tests for new remoteVersions in the second "else if" branch of
> > getFuncs and getAggregates has been removed, since it cannot ever be
> > reached.
>
> Looks really good.

Thanks, pushed it.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services