Re: ProcessStandbyHSFeedbackMessage can make global xmin go backwards

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: ProcessStandbyHSFeedbackMessage can make global xmin go backwards
Date: 2011-10-21 15:50:53
Message-ID: CAHyXU0zOJra9NCDGOzYy+kRiT5SPCiaJ6PHaHDqCiQ1Y5NdTng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 19, 2011 at 6:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> ProcessStandbyHSFeedbackMessage has a race condition: it thinks it can
> call GetOldestXmin and then the result will politely hold still while
> it considers what to do next.  But in fact, whoever has the oldest xmin
> could exit their transaction, allowing the global minimum to advance.
> If a VACUUM process then inspects the ProcArray, it could compute an
> oldest xmin that is newer than the value that
> ProcessStandbyHSFeedbackMessage installs just afterwards.  So much
> for keeping the data the standby wanted.
>
> AFAICS we have to do all the logic about choosing the new value for
> MyProc->xmin while holding ProcArrayLock, which IMO means that it should
> go into a function in procarray.c.  The fact that walsender.c is taking
> ProcArrayLock and whacking MyProc->xmin around is already a horrid
> violation of modularity, even if it weren't incorrect.
>
> Also, it seems like using GetOldestXmin is quite wrong here anyway; we
> don't really want a result modified by vacuum_defer_cleanup_age do we?
> It looks to me like that would result in vacuum_defer_cleanup_age being
> applied twice: the walsender process sets its xmin the defer age into
> the past, and then subsequent calls of GetOldestXmin compute a result
> that is the defer age further back.  IOW this is an independent
> mechanism that also results in the computed global xmin going backwards.
>
> (Now that we have a standby feedback mechanism, I'm a bit tempted to
> propose getting rid of vacuum_defer_cleanup_age altogether, rather than
> hacking things to avoid the above.)

curious: are these bugs in production, and what would be the user
visible symptoms of seeing them in the wild?

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Marti Raudsepp 2011-10-21 15:57:47 Re: pg_comments (was: Allow \dd to show constraint comments)
Previous Message Marti Raudsepp 2011-10-21 15:45:52 Re: [PATCH] Log crashed backend's query v3