Re: New Event Trigger: table_rewrite

Lists: pgsql-hackers
From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: New Event Trigger: table_rewrite
Date: 2014-10-14 20:19:26
Message-ID: m238aqwgj5.fsf@2ndQuadrant.fr

Hi fellow hackers,

Please find attached to this email a patch to implement a new Event
Trigger, fired on the "table_rewrite" event. As attached, it's meant
as a discussion enabler and only supports ALTER TABLE (and maybe not in
all forms of it). It will need to grow support for VACUUM FULL and
CLUSTER and more before getting committed.

Also, I'd like to work on the AccessExclusiveLock Event Trigger next,
but wanted this simpler one to be accepted first, as the way to
approach adding events that are not DDL-centric.

This time it's not about which command is running, it's about what the
command is doing.

src/backend/commands/event_trigger.c        | 92 ++++++++++++++++++++-
src/backend/commands/tablecmds.c            | 35 +++++++-
src/backend/utils/cache/evtcache.c          |  2 +
src/include/commands/event_trigger.h        |  1 +
src/include/utils/evtcache.h                |  3 +-
src/test/regress/expected/event_trigger.out | 18 ++++
src/test/regress/sql/event_trigger.sql      | 21 +++++
7 files changed, 166 insertions(+), 6 deletions(-)

Regards,
--
Dimitri Fontaine 06 63 07 10 78
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
table_rewrite.0.patch text/x-patch 10.4 KB

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-10-16 09:18:52
Message-ID: m2fveobceb.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
> Please find attached to this email a patch to implement a new Event
> Trigger, fired on the "table_rewrite" event. As attached, it's meant
> as a discussion enabler and only supports ALTER TABLE (and maybe not in
> all forms of it). It will need to grow support for VACUUM FULL and
> CLUSTER and more before getting committed.

And here's already a new version of it, including support for ALTER
TABLE, VACUUM and CLUSTER commands, and documentation.

It's still a small patch:

doc/src/sgml/event-trigger.sgml             | 106 ++++++++++++++++++++
src/backend/commands/cluster.c              |  14 ++-
src/backend/commands/event_trigger.c        | 106 +++++++++++++++++++-
src/backend/commands/tablecmds.c            |  53 ++++++++--
src/backend/commands/vacuum.c               |   3 +-
src/backend/utils/cache/evtcache.c          |   2 +
src/include/commands/cluster.h              |   4 +-
src/include/commands/event_trigger.h        |   1 +
src/include/utils/evtcache.h                |   3 +-
src/test/regress/expected/event_trigger.out |  23 +++++
src/test/regress/sql/event_trigger.sql      |  24 +++++
11 files changed, 322 insertions(+), 17 deletions(-)

--
Dimitri Fontaine 06 63 07 10 78
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
table_rewrite.1.patch text/x-patch 46.2 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-10-28 12:47:22
Message-ID: CA+U5nMJEgkEARGAZTkGvkTGG6soa7-fCtvET4HwtgnTOQPqjsA@mail.gmail.com
Lists: pgsql-hackers

On 16 October 2014 10:18, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr> wrote:
> Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> writes:
>> Please find attached to this email a patch to implement a new Event
>> Trigger, fired on the "table_rewrite" event. As attached, it's meant
>> as a discussion enabler and only supports ALTER TABLE (and maybe not in
>> all forms of it). It will need to grow support for VACUUM FULL and
>> CLUSTER and more before getting committed.
>
> And here's already a new version of it, including support for ALTER
> TABLE, VACUUM and CLUSTER commands, and documentation.

The patch itself looks fine overall. Docs look in place, tests OK.

API changes may need more thought. I'm not sure myself, they just look
fairly quick.

It would be more useful to work on the applications of this....

1. INSERT into a table
* Action start time
* Schema
* Tablename
* Number of blocks in table
which would then allow you to do things like run an assessment report
showing which tables would be rewritten.

2. Get access to number of blocks, so you could limit rewrites only to
smaller tables by putting a block limit in place.

3. It might be even cooler to contemplate having pg_stat_activity
publish an estimated end time.
We'd probably need some kind of time_per_block parameter for each
tablespace so we can estimate the time.

Doing 1 and 2 at least would make this a good feature. We can do a
later patch for 3, or similar, once this is accepted.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-07 12:35:10
Message-ID: m2vbmrur29.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> It would be more useful to work on the applications of this....
>
> 1. INSERT into a table
> * Action start time
> * Schema
> * Tablename
> * Number of blocks in table
> which would then allow you to do things like run an assessment report
> showing which tables would be rewritten.

That should be done by the user, from within his Event Trigger code. For
that to be possible, the previous patch was missing a way to expose the
OID of the table being rewritten, I've now added support for that.

> 2. Get access to number of blocks, so you could limit rewrites only to
> smaller tables by putting a block limit in place.

Also, I did expand the docs to fully cover your practical use case of a
table_rewrite Event Trigger implementing such a table rewrite policy.

> 3. It might be even cooler to contemplate having pg_stat_activity
> publish an estimated end time.
> We'd probably need some kind of time_per_block parameter for each
> tablespace so we can estimate the time.

That feels like another patch entirely.

Regards,
--
Dimitri Fontaine 06 63 07 10 78
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
table_rewrite.2.patch text/x-patch 58.1 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-15 23:57:34
Message-ID: CA+U5nMKRzif4PmavxzR08+Qge9mQTm0cX6M8OPW3+XuUd2xnNg@mail.gmail.com
Lists: pgsql-hackers

On 7 November 2014 12:35, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> It would be more useful to work on the applications of this....
>>
>> 1. INSERT into a table
>> * Action start time
>> * Schema
>> * Tablename
>> * Number of blocks in table
>> which would then allow you to do things like run an assessment report
>> showing which tables would be rewritten.
>
> That should be done by the user, from within his Event Trigger code. For
> that to be possible, the previous patch was missing a way to expose the
> OID of the table being rewritten, I've now added support for that.
>
>> 2. Get access to number of blocks, so you could limit rewrites only to
>> smaller tables by putting a block limit in place.
>
> Also, I did expand the docs to fully cover your practical use case of a
> table_rewrite Event Trigger implementing such a table rewrite policy.

That looks complete, very useful and well documented.

I'm looking to commit this tomorrow.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-16 06:59:42
Message-ID: CAB7nPqTqZ2-YcNzOQ5KVBUJYHQ4kDSd4Q55Mc-fBzM8GH0bV2Q@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 16, 2014 at 8:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 7 November 2014 12:35, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr> wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> It would be more useful to work on the applications of this....
>>>
>>> 1. INSERT into a table
>>> * Action start time
>>> * Schema
>>> * Tablename
>>> * Number of blocks in table
>>> which would then allow you to do things like run an assessment report
>>> showing which tables would be rewritten.
>>
>> That should be done by the user, from within his Event Trigger code. For
>> that to be possible, the previous patch was missing a way to expose the
>> OID of the table being rewritten, I've now added support for that.
>>
>>> 2. Get access to number of blocks, so you could limit rewrites only to
>>> smaller tables by putting a block limit in place.
>>
>> Also, I did expand the docs to fully cover your practical use case of a
>> table_rewrite Event Trigger implementing such a table rewrite policy.
>
> That looks complete, very useful and well documented.
>
> I'm looking to commit this tomorrow.
Patch applies, with many hunks though. Patch and documentation compile
without warnings, passing make check-world.

Some comments:
1) This patch is authorizing VACUUM and CLUSTER to use the event
triggers ddl_command_start and ddl_command_end, but aren't those
commands actually not DDLs but control commands?
2) The documentation of src/sgml/event-trigger.sgml can be improved,
particularly I think that the example function should use a maximum of
upper-case letters for reserved keywords, and also this bit:
you're not allowed to rewrite the table foo
should be rewritten to something like that:
Rewrite of table foo not allowed
3) A typo, missing a plural:
provides two built-in event trigger helper functionS
4) pg_event_trigger_table_rewrite_oid is able to return only one OID,
which is the one of the table being rewritten, and it is limited to
one OID because VACUUM, CLUSTER and ALTER TABLE can only run on one
object at the same time in a single transaction. What about thinking
that we may have in the future multiple objects rewritten in a single
transaction, hence multiple OIDs could be fetched?
5) parsetree is passed to cluster_rel only for
EventTriggerTableRewrite. I am not sure if there are any extension
using cluster_rel as is but wouldn't it make more sense to call
EventTriggerTableRewrite before the calls to cluster_rel instead? ISTM
that this patch is breaking cluster_rel way of doing things.
6) in_table_rewrite seems unnecessary.
typedef struct EventTriggerQueryState
{
    slist_head  SQLDropList;
    bool        in_sql_drop;
+   bool        in_table_rewrite;
+   Oid         tableOid;
We could simplify that by renaming tableOid to rewriteTableOid or
rewriteObjOid and check if its value is InvalidOid to determine if the
event table_rewrite is in use or not. Each code path setting those
variables sets them all the time similarly:
+ state->in_table_rewrite = false;
+ state->tableOid = InvalidOid;
And if tableOid is InvalidOid, in_table_rewrite is false. If it is a
valid Oid, in_table_rewrite is set to true.
7) table_rewrite is kicked in ALTER TABLE only when ATRewriteTables is
used. The list of commands that actually go through this code path
should be clarified in the documentation IMO to help the user
apprehend this function.
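
Point 6 above can be sketched as follows. This is a minimal standalone illustration, not the patch's code: Oid and InvalidOid are redefined locally as stand-ins for the PostgreSQL definitions, and RewriteState, rewriteTableOid and the helper functions are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins so the sketch compiles outside the PostgreSQL tree. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, reduced state: the OID alone says whether a rewrite is active. */
typedef struct RewriteState
{
    Oid rewriteTableOid;    /* InvalidOid when no table_rewrite is in progress */
} RewriteState;

/* Reset the state: no rewrite in progress. */
void
rewrite_state_init(RewriteState *state)
{
    state->rewriteTableOid = InvalidOid;
}

/* Record the table being rewritten; a valid OID marks the event as active. */
void
rewrite_state_begin(RewriteState *state, Oid tableOid)
{
    assert(tableOid != InvalidOid);
    state->rewriteTableOid = tableOid;
}

/* Clear the OID once the event has been handled. */
void
rewrite_state_end(RewriteState *state)
{
    state->rewriteTableOid = InvalidOid;
}

/* Replaces the separate in_table_rewrite flag. */
bool
in_table_rewrite(const RewriteState *state)
{
    return state->rewriteTableOid != InvalidOid;
}
```

With this layout the flag can never disagree with the OID, because there is only one field to set.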

Note that this patch has been submitted but there has been no real
discussion around it. This seems a bit too fast to commit it, no?
Regards,
--
Michael


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-16 10:32:39
Message-ID: CA+U5nMKjZCGMbBJLzH5rk-AA3FKvtZSHGXtJ4q-6x3WdO9Bdag@mail.gmail.com
Lists: pgsql-hackers

On 16 November 2014 06:59, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

> Note that this patch has been submitted but there has been no real
> discussion around it. This seems a bit too fast to commit it, no?

Committing uncontentious patches at the end of the commitfest seems normal, no?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-16 10:51:06
Message-ID: CA+U5nMKh6JDha-unwE6eay1JTa1Wqwwn88tU-=VYcVfe6Fj2AA@mail.gmail.com
Lists: pgsql-hackers

On 16 November 2014 06:59, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

> 1) This patch is authorizing VACUUM and CLUSTER to use the event
> triggers ddl_command_start and ddl_command_end, but aren't those
> commands actually not DDLs but control commands?

I could go either way on that. I'm happy to remove those from this commit.

> 2) The documentation of src/sgml/event-trigger.sgml can be improved,
> particularly I think that the example function should use a maximum of
> upper-case letters for reserved keywords, and also this bit:
> you're not allowed to rewrite the table foo
> should be rewritten to something like that:
> Rewrite of table foo not allowed
> 3) A typo, missing a plural:
> provides two built-in event trigger helper functionS

I thought the documentation was very good, in comparison to most other
feature submissions. Given that this is one of the areas I moan about
a lot, that says something.

> 4) pg_event_trigger_table_rewrite_oid is able to return only one OID,
> which is the one of the table being rewritten, and it is limited to
> one OID because VACUUM, CLUSTER and ALTER TABLE can only run on one
> object at the same time in a single transaction. What about thinking
> that we may have in the future multiple objects rewritten in a single
> transaction, hence multiple OIDs could be fetched?

Why would this API support something which the normal trigger API
doesn't, just in case we support a feature that hadn't ever been
proposed or discussed? Why can't such a change wait until that feature
arrives?

> 5) parsetree is passed to cluster_rel only for
> EventTriggerTableRewrite. I am not sure if there are any extension
> using cluster_rel as is but wouldn't it make more sense to call
> EventTriggerTableRewrite before the calls to cluster_rel instead? ISTM
> that this patch is breaking cluster_rel way of doing things.

I will remove the call to CLUSTER and VACUUM as proposed above.

> 6) in_table_rewrite seems unnecessary.
> typedef struct EventTriggerQueryState
> {
>     slist_head  SQLDropList;
>     bool        in_sql_drop;
> +   bool        in_table_rewrite;
> +   Oid         tableOid;
> We could simplify that by renaming tableOid to rewriteTableOid or
> rewriteObjOid and check if its value is InvalidOid to determine if the
> event table_rewrite is in use or not. Each code path setting those
> variables sets them all the time similarly:
> + state->in_table_rewrite = false;
> + state->tableOid = InvalidOid;
> And if tableOid is InvalidOid, in_table_rewrite is false. If it is a
> valid Oid, in_table_rewrite is set to true.

Well, that seems a minor change. I'm happy to accept the original
coding, but also happy to receive suggested changes.

> 7) table_rewrite is kicked in ALTER TABLE only when ATRewriteTables is
> used. The list of commands that actually go through this code path
> should be clarified in the documentation IMO to help the user
> apprehend this function.

That is somewhat orthogonal to the patch. The rules for rewriting are
quite complex, which is why this is needed and why documentation isn't
really the answer. Separate doc patch on that would be welcome.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-17 18:24:46
Message-ID: CA+TgmobO+XhOWjMKrSZFh57jcvUcbeQR_q8zVBSVEZW70fHVSg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Nov 16, 2014 at 5:51 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 16 November 2014 06:59, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>> 1) This patch is authorizing VACUUM and CLUSTER to use the event
>> triggers ddl_command_start and ddl_command_end, but aren't those
>> commands actually not DDLs but control commands?
>
> I could go either way on that. I'm happy to remove those from this commit.

Yeah, this patch definitely shouldn't change the set of commands to
which existing event triggers apply as a side-effect. There's no
reason new DDL commands need to apply to the same set of operations as
existing DDL commands, but the existing ones shouldn't be changed
without specific discussion and agreement.

It seems pretty weird, also, that the event trigger will fire after
we've taken AccessExclusiveLock when you cluster a particular
relation, and before we've taken AccessExclusiveLock when you cluster
database-wide. That's more or less an implementation artifact of the
current code that we're exposing to the user for, really, no good
reason.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-18 22:14:55
Message-ID: m2389gp34w.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Hi,

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> 1) This patch is authorizing VACUUM and CLUSTER to use the event
> triggers ddl_command_start and ddl_command_end, but aren't those
> commands actually not DDLs but control commands?

Reverted in the attached version 3 of the patch.

> 6) in_table_rewrite seems unnecessary.

Removed in the attached version 3 of the patch.

On Sun, Nov 16, 2014 at 5:51 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> 4) pg_event_trigger_table_rewrite_oid is able to return only one OID,
>> which is the one of the table being rewritten, and it is limited to
>> one OID because VACUUM, CLUSTER and ALTER TABLE can only run on one
>> object at the same time in a single transaction. What about thinking
>> that we may have in the future multiple objects rewritten in a single
>> transaction, hence multiple OIDs could be fetched?
>
> Why would this API support something which the normal trigger API
> doesn't, just in case we support a feature that hadn't ever been
> proposed or discussed? Why can't such a change wait until that feature
> arrives?

Agreed, unchanged in the attached.

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> It seems pretty weird, also, that the event trigger will fire after
> we've taken AccessExclusiveLock when you cluster a particular
> relation, and before we've taken AccessExclusiveLock when you cluster
> database-wide. That's more or less an implementation artifact of the
> current code that we're exposing to the user for, really, no good
> reason.

In the CLUSTER implementation we have only one call site for invoking
the Event Trigger, in cluster_rel(). While it's true that in the single
relation case, the relation is opened in cluster() then cluster_rel() is
called, the opening is done with NoLock in cluster():

rel = heap_open(tableOid, NoLock);

My understanding is that the relation locking only happens in
cluster_rel() at this line:

OldHeap = try_relation_open(tableOid, AccessExclusiveLock);

Please help me through the cluster locking strategy here, I feel like
I'm missing something obvious, as my conclusion from re-reading the code
in light of your comment is that your comment is not accurate with
respect to the current state of the code.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
table_rewrite.3.patch text/x-patch 60.2 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-18 22:34:02
Message-ID: 20141118223402.GF1948@alvin.alvh.no-ip.org
Lists: pgsql-hackers

Dimitri Fontaine wrote:

> In the CLUSTER implementation we have only one call site for invoking
> the Event Trigger, in cluster_rel(). While it's true that in the single
> relation case, the relation is opened in cluster() then cluster_rel() is
> called, the opening is done with NoLock in cluster():
>
> rel = heap_open(tableOid, NoLock);
>
> My understanding is that the relation locking only happens in
> cluster_rel() at this line:
>
> OldHeap = try_relation_open(tableOid, AccessExclusiveLock);
>
> Please help me through the cluster locking strategy here, I feel like
> I'm missing something obvious, as my conclusion from re-reading the code
> in light of your comment is that your comment is not accurate with
> respect to the current state of the code.

Almost the whole of that function is conditions to bail out clustering
the relation if things have changed since the relation list was
collected. It seems wrong to invoke the event trigger in all those
cases; it's going to fire spuriously. I think you should move the
invocation of the event trigger to the end, just before rebuild_relation
is called. Not sure where relative to the predicate lock stuff therein;
probably before, so that we avoid doing that dance if the event trigger
function decides to jump ship.

In ATRewriteTables, it seems wrong to call it after make_new_heap. If
the event trigger function aborts, we end up with useless work done
there; so I think it should be called before that. Also, why do you
have the evt_table_rewrite_fired stuff? I think you should fire one
event per table, no?

The second ATRewriteTable call in ATRewriteTables does not actually
rewrite the table; it only scans it to verify constraints. So I'm
thinking you shouldn't call this event trigger there. Or, if we decide
we want this, we probably also need something for the table scans in
ALTER DOMAIN too.
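
A toy model of the ordering suggested above: fire one event per table that is actually rewritten, before any new heap is built, and skip the scan-only pass. All names here (TableWork, process_tables, allow_all, deny_all) are hypothetical, and a veto is modelled as a false return value, whereas a real event trigger function would raise an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-table work item. */
typedef struct TableWork
{
    bool needs_rewrite;     /* false: the pass only scans to verify constraints */
    bool new_heap_built;    /* stand-in for the effect of make_new_heap() */
    int  events_fired;      /* how many table_rewrite events this table saw */
} TableWork;

/* Hypothetical trigger callback: returns false to veto the rewrite. */
typedef bool (*RewriteTrigger) (const TableWork *tab);

bool allow_all(const TableWork *tab) { (void) tab; return true; }
bool deny_all(const TableWork *tab)  { (void) tab; return false; }

/* Returns false as soon as a trigger vetoes a rewrite, true otherwise. */
bool
process_tables(TableWork *tabs, size_t ntabs, RewriteTrigger trigger)
{
    for (size_t i = 0; i < ntabs; i++)
    {
        TableWork *tab = &tabs[i];

        if (!tab->needs_rewrite)
            continue;               /* scan-only pass: no event */

        assert(!tab->new_heap_built);
        tab->events_fired++;        /* one event per rewritten table */
        if (!trigger(tab))
            return false;           /* veto before any new heap exists */

        tab->new_heap_built = true; /* expensive work happens only now */
    }
    return true;
}
```

A vetoed table ends with events_fired == 1 and new_heap_built == false, which is the "no useless work" property asked for above.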

You still have the ANALYZE thing in docs, which now should be removed.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-19 17:57:21
Message-ID: m2vbmbkr9a.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Almost the whole of that function is conditions to bail out clustering
> the relation if things have changed since the relation list was
> collected. It seems wrong to invoke the event trigger in all those
> cases; it's going to fire spuriously. I think you should move the
> invocation of the event trigger to the end, just before rebuild_relation
> is called. Not sure where relative to the predicate lock stuff therein;
> probably before, so that we avoid doing that dance if the event trigger
> function decides to jump ship.

Actually when you do a CLUSTER or a VACUUM FULL you know that the
table is going to be rewritten on disk, because that's about the only
purpose of the command.

Given the complexity involved here, the new version of the patch
(attached) has removed support for those statements.

> In ATRewriteTables, it seems wrong to call it after make_new_heap. If
> the event trigger function aborts, we end up with useless work done
> there; so I think it should be called before that. Also, why do you
> have the evt_table_rewrite_fired stuff? I think you should fire one
> event per table, no?

Fixed in the attached version of the patch.

> The second ATRewriteTable call in ATRewriteTables does not actually
> rewrite the table; it only scans it to verify constraints. So I'm
> thinking you shouldn't call this event trigger there. Or, if we decide
> we want this, we probably also need something for the table scans in
> ALTER DOMAIN too.

Fixed in the attached version of the patch.

> You still have the ANALYZE thing in docs, which now should be removed.

Fixed in the attached version of the patch.

--
Dimitri Fontaine 06 63 07 10 78
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
table_rewrite.4.patch text/x-patch 53.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-19 18:01:25
Message-ID: CA+TgmoZhcSspPpr-ryUOPB69Nm0iwp_-GbT2=95pyR0buLKZ_w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 18, 2014 at 5:14 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> It seems pretty weird, also, that the event trigger will fire after
>> we've taken AccessExclusiveLock when you cluster a particular
>> relation, and before we've taken AccessExclusiveLock when you cluster
>> database-wide. That's more or less an implementation artifact of the
>> current code that we're exposing to the user for, really, no good
>> reason.
>
> In the CLUSTER implementation we have only one call site for invoking
> the Event Trigger, in cluster_rel(). While it's true that in the single
> relation case, the relation is opened in cluster() then cluster_rel() is
> called, the opening is done with NoLock in cluster():
>
> rel = heap_open(tableOid, NoLock);
>
> My understanding is that the relation locking only happens in
> cluster_rel() at this line:
>
> OldHeap = try_relation_open(tableOid, AccessExclusiveLock);
>
> Please help me through the cluster locking strategy here, I feel like
> I'm missing something obvious, as my conclusion from re-reading the code
> in light of your comment is that your comment is not accurate with
> respect to the current state of the code.

Unless I'm missing something, when you cluster a particular relation,
cluster() does this:

/* Find and lock the table */
rel = heap_openrv(stmt->relation, AccessExclusiveLock);

I don't see the "rel = heap_open(tableOid, NoLock);" line you quoted anywhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-19 18:02:51
Message-ID: CA+TgmoZ7HxOdR4_VKYvQvD1bJO1UXw+XaDVhr=x6fZjZ1zM6GA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 18, 2014 at 5:34 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Almost the whole of that function is conditions to bail out clustering
> the relation if things have changed since the relation list was
> collected. It seems wrong to invoke the event trigger in all those
> cases; it's going to fire spuriously. I think you should move the
> invocation of the event trigger to the end, just before rebuild_relation
> is called. Not sure where relative to the predicate lock stuff therein;
> probably before, so that we avoid doing that dance if the event trigger
> function decides to jump ship.

I can see two problems with that:

1. What if the conditions aren't true any more after the event trigger
has run? Then it's unsafe.

2. If we do it that way, then we'll unnecessarily wait for a lock on
the relation even if the event trigger is just going to bail out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-19 18:06:39
Message-ID: CA+Tgmoa5sotcxVyVR_iq1gR_z9D3LnHcqgU9BXHq=j-Br-nYdA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 19, 2014 at 1:01 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Nov 18, 2014 at 5:14 PM, Dimitri Fontaine
> <dimitri(at)2ndquadrant(dot)fr> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> It seems pretty weird, also, that the event trigger will fire after
>>> we've taken AccessExclusiveLock when you cluster a particular
>>> relation, and before we've taken AccessExclusiveLock when you cluster
>>> database-wide. That's more or less an implementation artifact of the
>>> current code that we're exposing to the user for, really, no good
>>> reason.
>>
>> In the CLUSTER implementation we have only one call site for invoking
>> the Event Trigger, in cluster_rel(). While it's true that in the single
>> relation case, the relation is opened in cluster() then cluster_rel() is
>> called, the opening is done with NoLock in cluster():
>>
>> rel = heap_open(tableOid, NoLock);
>>
>> My understanding is that the relation locking only happens in
>> cluster_rel() at this line:
>>
>> OldHeap = try_relation_open(tableOid, AccessExclusiveLock);
>>
>> Please help me through the cluster locking strategy here, I feel like
>> I'm missing something obvious, as my conclusion from re-reading the code
>> in light of your comment is that your comment is not accurate with
>> respect to the current state of the code.
>
> Unless I'm missing something, when you cluster a particular relation,
> cluster() does this:
>
> /* Find and lock the table */
> rel = heap_openrv(stmt->relation, AccessExclusiveLock);
>
> I don't see the "rel = heap_open(tableOid, NoLock);" line you quoted anywhere.

...which is because I have the 9.1 branch checked out. Genius. But
what I said originally is still true, because the current code looks
like this:

/* Find, lock, and check permissions on the table */
tableOid = RangeVarGetRelidExtended(stmt->relation,
                                    AccessExclusiveLock,
                                    false, false,
                                    RangeVarCallbackOwnsTable, NULL);
rel = heap_open(tableOid, NoLock);

It's true that the heap_open() is not acquiring any lock. But the
RangeVarGetRelidExtended() call right before it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-20 01:25:30
Message-ID: CAB7nPqQMtxV3PLBRo3aTH3-eV1=UYsbPtKBjOO6aDAvpPXULAw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 20, 2014 at 2:57 AM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Fixed in the attached version of the patch.
Thanks! Things are moving nicely for this patch. Patch compiles and
passes check-world. Some minor comments about the latest version:
1) Couldn't this paragraph be reworked?
<para>
+ The <literal>table_rewrite</> event occurs just before a table is going to
+ get rewritten by the commands <literal>ALTER TABLE</literal>. While other
+ control statements are available to rewrite a table,
+ like <literal>CLUSTER</literal> and <literal>VACUUM</literal>,
+ the <literal>table_rewrite</> event is currently only triggered by
+ the <literal>ALTER TABLE</literal> command, which might or might not need
+ to rewrite the table.
+ </para>
CLUSTER and VACUUM are not part of the supported commands anymore, so
I think that we could replace that by the addition of a reference
number in the cell of ALTER TABLE for the event table_rewrite and
write at the bottom of the table a description of how this event
behaves with ALTER TABLE. Note as well that "might or might not" is
not really helpful for the user.
2) The examples of SQL queries provided are still in lower case in the
docs, that's contrary to the rest of the docs where upper case is used
for reserved keywords.
+ <para>
+ Here's an example implementing such a policy.
+<programlisting>
+create or replace function no_rewrite()
+ returns event_trigger
+ language plpgsql as
[...]
3) This reference can be completely removed:
/*
* Otherwise, command should be CREATE, ALTER, or DROP.
+ * Or one of ANALYZE, CLUSTER, VACUUM.
*/
4) In those places as well CLUSTER and VACUUM should be removed:
+ else if (pg_strncasecmp(tag, "ANALYZE", 7) == 0 ||
+ pg_strncasecmp(tag, "CLUSTER", 7) == 0 ||
+ pg_strncasecmp(tag, "VACUUM", 6) == 0)
+ return EVENT_TRIGGER_COMMAND_TAG_OK;
else
And here:
+ if (pg_strcasecmp(tag, "ALTER TABLE") == 0 ||
+ pg_strcasecmp(tag, "CLUSTER") == 0 ||
+ pg_strcasecmp(tag, "VACUUM") == 0 ||
+ pg_strcasecmp(tag, "ANALYZE") == 0 )
+ return EVENT_TRIGGER_COMMAND_TAG_OK
I am noticing that the points raised by Alvaro previously are fixed.
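
To illustrate point 4, here is a minimal, self-contained sketch (not code from the patch) of the difference between the prefix match in the first hunk and an exact match restricted to ALTER TABLE. It assumes POSIX strncasecmp()/strcasecmp() as stand-ins for pg_strncasecmp()/pg_strcasecmp(); the enum and both function names are hypothetical and exist only for this example.

```c
#include <strings.h>   /* strcasecmp, strncasecmp (POSIX stand-ins) */

/* Hypothetical result codes; the patch itself returns
 * EVENT_TRIGGER_COMMAND_TAG_OK and friends. */
typedef enum { TAG_OK, TAG_NOT_SUPPORTED } tag_check_result;

/* Prefix test, shaped like the quoted check_ddl_tag() hunk:
 * any tag that merely starts with one of the keywords is accepted. */
static tag_check_result
check_tag_prefix(const char *tag)
{
    if (strncasecmp(tag, "ANALYZE", 7) == 0 ||
        strncasecmp(tag, "CLUSTER", 7) == 0 ||
        strncasecmp(tag, "VACUUM", 6) == 0)
        return TAG_OK;
    return TAG_NOT_SUPPORTED;
}

/* Exact test with CLUSTER, VACUUM and ANALYZE removed:
 * only the full ALTER TABLE tag is accepted (case-insensitively). */
static tag_check_result
check_table_rewrite_tag(const char *tag)
{
    if (strcasecmp(tag, "ALTER TABLE") == 0)
        return TAG_OK;
    return TAG_NOT_SUPPORTED;
}
```

With these two helpers, "CLUSTER" passes the prefix test but is rejected by the ALTER TABLE-only test, which is the behaviour the review comment asks for.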
Regards,
--
Michael


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-20 02:17:30
Message-ID: 20141120021730.GH1639@alvin.alvh.no-ip.org
Lists: pgsql-hackers

Michael Paquier wrote:

> 1) Couldn't this paragraph be reworked?
> <para>
> + The <literal>table_rewrite</> event occurs just before a table is going to
> + get rewritten by the commands <literal>ALTER TABLE</literal>. While other
> + control statements are available to rewrite a table,
> + like <literal>CLUSTER</literal> and <literal>VACUUM</literal>,
> + the <literal>table_rewrite</> event is currently only triggered by
> + the <literal>ALTER TABLE</literal> command, which might or might not need
> + to rewrite the table.
> + </para>
> CLUSTER and VACUUM are not part of the supported commands anymore, so
> I think that we could replace that by the addition of a reference
> number in the cell of ALTER TABLE for the event table_rewrite and
> write at the bottom of the table a description of how this event
> behaves with ALTER TABLE. Note as well that "might or might not" is
> not really helpful for the user.

That's precisely why we have an event trigger here, I think --- for some
subcommands, it's not easy to determine whether a rewrite happens or
not. (I think SET TYPE is the one). I don't think we want to document
precisely under what condition a rewrite takes place.

> 2) The examples of SQL queries provided are still in lower case in the
> docs, that's contrary to the rest of the docs where upper case is used
> for reserved keywords.
> + <para>
> + Here's an example implementing such a policy.
> +<programlisting>
> +create or replace function no_rewrite()
> + returns event_trigger
> + language plpgsql as

Yes please. <nitpick> Another thing in that sample code is "not current_hour
between 1 and 6". That reads strange to me. It should be equally
correct to spell it as "current_hour not between 1 and 6", which seems
more natural. </>

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-11-20 13:37:43
Message-ID: m2y4r6gfh4.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> CLUSTER and VACUUM are not part of the supported commands anymore, so
>> I think that we could replace that by the addition of a reference
>> number in the cell of ALTER TABLE for the event table_rewrite and
>> write at the bottom of the table a description of how this event
>> behaves with ALTER TABLE. Note as well that "might or might not" is
>> not really helpful for the user.
>
> That's precisely why we have an event trigger here, I think --- for some
> subcommands, it's not easy to determine whether a rewrite happens or
> not. (I think SET TYPE is the one). I don't think we want to document
> precisely under what condition a rewrite takes place.

Yeah, the current documentation expands to the following sentence, as
browsed in

http://www.postgresql.org/docs/9.3/interactive/sql-altertable.html

As an exception, if the USING clause does not change the column
contents and the old type is either binary coercible to the new type
or an unconstrained domain over the new type, a table rewrite is not
needed, but any indexes on the affected columns must still be rebuilt.

I don't think that “might or might not” is less helpful in the context
of the Event Trigger, because the whole point is that the event is only
triggered in case of a rewrite. Of course we could cross link the two
paragraphs or something.

>> 2) The examples of SQL queries provided are still in lower case in the
>> docs, that's contrary to the rest of the docs where upper case is used
>> for reserved keywords.

Right, being consistent trumps personal preferences, changed in the
attached.

> Yes please. <nitpick> Another thing in that sample code is "not current_hour
> between 1 and 6". That reads strange to me. It should be equally
> correct to spell it as "current_hour not between 1 and 6", which seems
> more natural. </>

True, fixed in the attached.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
table_rewrite.5.patch text/x-patch 53.1 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New Event Trigger: table_rewrite
Date: 2014-12-02 07:22:28
Message-ID: CAB7nPqSjKWf2u4yJRAK29cX0Qr_HwZEz3DNjptioi0apXMvCZw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 20, 2014 at 10:37 PM, Dimitri Fontaine
<dimitri(at)2ndquadrant(dot)fr> wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>> CLUSTER and VACUUM are not part of the supported commands anymore, so
>>> I think that we could replace that by the addition of a reference
>>> number in the cell of ALTER TABLE for the event table_rewrite and
>>> write at the bottom of the table a description of how this event
>>> behaves with ALTER TABLE. Note as well that "might or might not" is
>>> not really helpful for the user.
>>
>> That's precisely why we have an event trigger here, I think --- for some
>> subcommands, it's not easy to determine whether a rewrite happens or
>> not. (I think SET TYPE is the one). I don't think we want to document
>> precisely under what condition a rewrite takes place.
>
> Yeah, the current documentation expands to the following sentence, as
> browsed in
>
> http://www.postgresql.org/docs/9.3/interactive/sql-altertable.html
>
> As an exception, if the USING clause does not change the column
> contents and the old type is either binary coercible to the new type
> or an unconstrained domain over the new type, a table rewrite is not
> needed, but any indexes on the affected columns must still be rebuilt.
>
> I don't think that "might or might not" is less helpful in the context
> of the Event Trigger, because the whole point is that the event is only
> triggered in case of a rewrite. Of course we could cross link the two
> paragraphs or something.
>
>>> 2) The examples of SQL queries provided are still in lower case in the
>>> docs, that's contrary to the rest of the docs where upper case is used
>>> for reserved keywords.
>
> Right, being consistent trumps personal preferences, changed in the
> attached.
>
>> Yes please. <nitpick> Another thing in that sample code is "not current_hour
>> between 1 and 6". That reads strange to me. It should be equally
>> correct to spell it as "current_hour not between 1 and 6", which seems
>> more natural. </>
>
> True, fixed in the attached.
The status of this patch was not updated on the commit fest app, so I
lost track of it. Sorry for not answering earlier btw.

A few things to note about v5:
1) There are still mentions of VACUUM, ANALYZE and CLUSTER:
@@ -264,6 +275,10 @@ check_ddl_tag(const char *tag)
obtypename = tag + 6;
else if (pg_strncasecmp(tag, "DROP ", 5) == 0)
obtypename = tag + 5;
+ else if (pg_strncasecmp(tag, "ANALYZE", 7) == 0 ||
+ pg_strncasecmp(tag, "CLUSTER", 7) == 0 ||
+ pg_strncasecmp(tag, "VACUUM", 6) == 0)
+ return EVENT_TRIGGER_COMMAND_TAG_OK;
2) There are a couple of typos and some incorrect styling, like "if(". Nothing huge.
Cleanup is done in the attached.

In any case, all the issues mentioned seem to have been addressed, so
switching this patch to ready for committer.
Regards,
--
Michael

Attachment Content-Type Size
20141202_table_rewrite_v6.patch text/x-diff 18.2 KB