Re: Patch: show relation and tuple infos of a lock to acquire

From: Christian Kruse <christian(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: show relation and tuple infos of a lock to acquire
Date: 2014-02-21 11:25:49
Message-ID: 20140221112549.GA27024@defunct.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 19.02.2014 08:11, Amit Kapila wrote:
> I have looked into this patch. Below are some points which I
> think should be improved in this patch:

Thank you for your review.

> 1. New context added by this patch is display at wrong place
> […]
> Do this patch expect to print the context at cancel request
> as well?
> I don't find it useful. I think the reason is that after setup of
> context, if any other error happens, it will display the newly setup
> context.

To be honest, I don't see a problem here. If you cancel the request in
most cases it is because it doesn't finish in an acceptable time. So
displaying the context seems logical to me.

> 2a. About the message:
> LOG: process 1936 still waiting for ShareLock on transaction 842
> after 1014.000ms
> CONTEXT: relation name: foo (OID 57496)
> tuple (ctid (0,1)): (1, 2, abc )
>
> Isn't it better to display database name as well, as if we see in
> one of related messages as below, it displays database name as well.
>
> "LOG: process 3012 still waiting for ExclusiveLock on tuple (0,1) of
> relation 57
> 499 of database 12045 after 1014.000 ms"

Maybe. But then we either have duplicate infos in some cases (e.g. on
a table lock) or we have to distinguish error cases and show distinct
contextes. And also getting the database name from a relation is not
really straight forward (according to Andres we would have to look at
rel->rd_node.dbNode) and since other commands dealing with names don't
do so, either, I think we should leave out the database name.

> 2b. I think there is no need to display *ctid* before tuple offset, it seems
> to be obvious as well as consistent with other similar message.

Ok, I'll drop it.

> 3. About callback I think rather than calling setup/tear down
> functions from multiple places, it will be better to have wrapper
> functions for XactTableLockWait() and MultiXactIdWait(). Just check
> in plpgsql_exec_function(), how the errorcallback is set, can we do
> something similar without to avoid palloc?

OK, Attached.

> 4. About the values of tuple
> I think it might be useful to display some unique part of tuple for
> debugging part, but what will we do incase there is no primary
> key on table, so may be we can display initial few columns (2 or 3)
> or just display tuple offset as is done in other similar message
> shown above.

Currently I simply display the whole tuple with truncating long
fields. This is plain easy, no distinction necessary. I did some code
reading and it seems somewhat complex to get the PK columns and it
seems that we need another lock for this, too – maybe not the best
idea when we are already in locking trouble, do you disagree?

Also showing just a few columns when no PK is available might not be
helpful since the tuple is not necessarily identified by the columns
showed. And since we drop the CTID there is no alternative solution to
identify the tuple.

IMHO simply displaying the whole tuple is the best solution in this
case, do you agree?

> More to the point, I have observed that in few cases
> displaying values of tuple can be confusing as well. For Example:
> […]
> Now here it will be bit confusing to display values with tuple, as
> in session-3 statement has asked to update value (3), but we have
> started waiting for value (4). Although it is right, but might not
> make much sense.

What do you suggest to avoid confusion? I can see what you are talking
about, but I'm not sure what to do about it. Keep in mind that this
patch should help debugging locking, so IMHO it is essential to be
able to identify a tuple and therefore knowing its values.

> Also after session-2 commits the transaction, session-3 will say
> acquired lock, however still it will not be able to update it.

To be honest, I don't think that this is a problem of the
patch. Concurrency is not easy and it might lead to confusing
situations. I don't think that we should drop the tuple values because
of this, since they provide useful and precious debugging information.

Attached you will find a new version of the patch, mainly using
wrapper functions for XactLockTableWait() and MultiXactIdWait().

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
show_table_name_and_tuple_in_lock_log_v4.patch text/x-diff 11.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ronan Dunklau 2014-02-21 11:26:05 Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Previous Message Andres Freund 2014-02-21 11:19:02 Re: Public header files change