Re: Inaccuracy in VACUUM's tuple count estimates

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inaccuracy in VACUUM's tuple count estimates
Date: 2014-06-09 18:24:22
Message-ID: 1402338262.12047.YahooMailNeo@web122302.mail.ne1.yahoo.com
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-06-09 09:45:12 -0700, Kevin Grittner wrote:

> I am not sure, given predicate.c's coding, how
> HEAPTUPLE_DELETE_IN_PROGRESS could cause problems. Could you elaborate,
> since that's the contentious point with Tom? Since 'both in progress'
> can only happen if xmin and xmax are the same toplevel xid and you
> resolve subxids to toplevel xids I think it should currently be safe
> either way?

The only way that it could be a problem is if the DELETE is in a
subtransaction which might get rolled back without rolling back the
INSERT.  If we ignore the conflict because we assume the INSERT
will be negated by the DELETE, and that doesn't happen, we would
get false negatives which would compromise correctness.  If we
assume that the DELETE might not happen when the DELETE is not in a
separate subtransaction we might get a false positive, which would
only be a performance hit.  If we know either is possible and have
a way to check in predicate.c, it's fine to check it there.
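
To make the rollback argument concrete, here is a minimal toy sketch in C. It is not predicate.c code and uses none of PostgreSQL's real data structures; `ToyXact` and the `toy_*` functions are hypothetical names invented for illustration. It assumes the only thing that matters is the nesting of (sub)transactions: the DELETE is certain to negate the INSERT only when rolling back the deleter necessarily rolls back the inserter too, i.e. when the deleter is the inserter itself or one of its ancestors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of transaction nesting; not PostgreSQL's structs. */
typedef struct ToyXact
{
    const struct ToyXact *parent;   /* NULL for a top-level transaction */
} ToyXact;

/* Resolve a (sub)transaction to its top-level transaction. */
static const ToyXact *
toy_toplevel(const ToyXact *x)
{
    assert(x != NULL);
    while (x->parent != NULL)
        x = x->parent;
    return x;
}

/* True if 'anc' is 'x' itself or one of x's ancestors. */
static bool
toy_is_self_or_ancestor(const ToyXact *anc, const ToyXact *x)
{
    for (; x != NULL; x = x->parent)
    {
        if (x == anc)
            return true;
    }
    return false;
}

/*
 * Assumption of this sketch: rolling back a (sub)transaction also rolls
 * back everything nested under it.  The DELETE therefore cannot be undone
 * while the INSERT survives only if the deleter is the inserter or an
 * ancestor of the inserter.  In every other case (deleter is a child or
 * sibling subtransaction) the INSERT may outlive the DELETE.
 */
static bool
toy_delete_surely_negates_insert(const ToyXact *inserter,
                                 const ToyXact *deleter)
{
    return toy_is_self_or_ancestor(deleter, inserter);
}
```

In this model, an INSERT in the top-level transaction followed by a DELETE in a child subtransaction shares one top-level xid, yet `toy_delete_surely_negates_insert` returns false for it. Comparing top-level xids alone cannot tell that case apart from the safe one, which is the false-negative risk described above.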

>>>     HEAPTUPLE_RECENTLY_DEAD,    /* tuple is dead, but not deletable yet */
>>> 1) xmin has committed, xmax has committed and wasn't only a locker. But
>>> xmax doesn't precede OldestXmin.
>>
>>  For my purposes, it would be better if this also included:
>>   2) xmin is in progress, xmax matches (or includes) xmin
>>
>>  ... but that would be only a performance tweak.
>
> I don't see that happening as there's several callers for which it is
> important to know whether the xacts are still alive or not.

OK

>>>     HEAPTUPLE_DELETE_IN_PROGRESS    /* deleting xact is still in progress */
>>> new:
>>> 1) xmin has committed, xmax is in progress, xmax is not just a locker
>>> 2) xmin is in progress, xmin is the current backend, xmax is not just a
>>>   locker and in progress.
>>
>>  I'm not clear on how 2) could happen unless xmax is the current
>>  backend or a subtransaction thereof.  Could you clarify?
>>
>>> old:
>>> 1) xmin has committed, xmax is in progress, xmax is not just a locker
>>> 2) xmin is in progress, xmax is set and not just a locker
>>>
>>> Note that the 2) case here never checked xmax's status.
>>
>>  Again, I'm not sure how 2) could happen unless they involve the
>>  same top-level transaction.  What am I missing?
>
> Right, both can only happen if the tuple is created & deleted in the
> same backend. Is that in contradiction to something you see?

Well, you're making a big point that the status of xmax was not
checked in the old code.  If xmax is the same as xmin and xmin is
in progress, the additional check seems redundant -- unless I'm
missing something.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
