Re: Cache invalidation bug in RelationGetIndexAttrBitmap()

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Cache invalidation bug in RelationGetIndexAttrBitmap()
Date: 2014-05-14 20:29:15
Message-ID: 20140514202915.GJ23943@awork2.anarazel.de
Lists: pgsql-hackers

Hi,

On 2014-05-14 21:04:41 +0200, Tomas Vondra wrote:
> On 14.5.2014 17:52, Andres Freund wrote:
> > On 2014-05-14 15:17:39 +0200, Andres Freund wrote:
> >> On 2014-05-14 15:08:08 +0200, Tomas Vondra wrote:
> >>> Apparently there's something wrong with 'test-decoding-check':
> >>
> >> Man. I shouldn't have asked... My code. There's some output in there
> >> that's probably triggered by the extraordinarily long runtimes, but
> >> there's definitely something else wrong.
> >> My gut feeling says it's in RelationGetIndexList().
> >
> > Nearly right. It's in RelationGetIndexAttrBitmap(). Fix attached.
> >
> > Tomas, thanks for that. I've never (and probably will never) run
> > CLOBBER_CACHE_RECURSIVELY during development. Having a machine do that
> > regularly is really helpful. How long does a single testrun take? It
> > takes hundreds of seconds here to do a single UPDATE?
>
> Don't know yet, as it fails at the beginning.

test-decoding is at the beginning? That's somewhat odd.

> But I suppose it will be
> tens or possibly hundreds of hours. For example these are the logs from
> regular build (no clobber etc.)

> May 14 19:00 SCM-checkout.log
> May 14 19:00 githead.log
> May 14 19:00 configure.log
> May 14 19:00 config.log
> May 14 19:05 make.log
> May 14 19:05 check.log
> May 14 19:06 make-contrib.log
> May 14 19:06 make-install.log
> May 14 19:06 install-contrib.log
> May 14 19:07 check-pg_upgrade.log
> May 14 19:08 test-decoding-check.log
>
> while these are the logs from recursive clobber:
>
> May 14 00:19 SCM-checkout.log
> May 14 00:20 configure.log
> May 14 00:20 config.log
> May 14 00:26 make.log
> May 14 03:12 check.log
> May 14 03:13 make-contrib.log
> May 14 03:13 make-install.log
> May 14 03:13 install-contrib.log
> May 14 08:25 check-pg_upgrade.log
> May 14 09:07 test-decoding-check.log
> May 14 09:07 web-txn.data
>
>
> So with the regular build, it took <1 minute to do 'make check' and ~1
> minute to test pg_upgrade, with recursive clobber it takes ~3 hours and
> ~5 hours. That's a factor of ~300, although it's a very rough
> estimate.

I seriously doubt that's recursive clobber. That should take *way*
longer. And indeed you have:

> -DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY -DMEMORY_CONTEXT_CHECKING
> -DRANDOMIZE_ALLOCATED_MEMORY -DCLOBBER_CACHE_RECURSIVELY
>
> it does not happen with
>
> CPPFLAGS => '-DCLOBBER_CACHE_ALWAYS -DCLOBBER_FREED_MEMORY
> -DMEMORY_CONTEXT_CHECKING -DRANDOMIZE_ALLOCATED_MEMORY',

#if defined(CLOBBER_CACHE_ALWAYS)
	{
		static bool in_recursion = false;

		if (!in_recursion)
		{
			in_recursion = true;
			InvalidateSystemCaches();
			in_recursion = false;
		}
	}
#elif defined(CLOBBER_CACHE_RECURSIVELY)
	InvalidateSystemCaches();
#endif

I.e. you can't specify -DCLOBBER_CACHE_ALWAYS and
-DCLOBBER_CACHE_RECURSIVELY together. The former will take precedence.
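
To make the precedence concrete, here is a minimal standalone sketch (not
PostgreSQL code). It defines both macros, as the quoted CPPFLAGS do, and
reports which branch of the same #if/#elif chain survives preprocessing.
The helper name active_clobber_mode() is made up for illustration.

```c
/* Both flags defined, mirroring the CPPFLAGS quoted above. */
#define CLOBBER_CACHE_ALWAYS
#define CLOBBER_CACHE_RECURSIVELY

/*
 * Hypothetical helper, for illustration only: returns the name of the
 * branch the preprocessor kept. Only the first matching branch of an
 * #if/#elif chain is compiled, so the #elif is never reached when
 * CLOBBER_CACHE_ALWAYS is also defined.
 */
const char *
active_clobber_mode(void)
{
#if defined(CLOBBER_CACHE_ALWAYS)
	return "always";
#elif defined(CLOBBER_CACHE_RECURSIVELY)
	return "recursively";
#else
	return "none";
#endif
}
```

With both defines present, active_clobber_mode() returns "always". Removing
the CLOBBER_CACHE_ALWAYS define is what makes it return "recursively", which
is the configuration a recursive clobber run would presumably need.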

> Without clobber the whole run (for a "C" locale) takes ~10 minutes, so
> my estimate is ~50 hours for the recursive one. But I wouldn't be
> surprised by 100 hours.

I'm afraid it's more in the year range from what I've seen. I.e. not
practical.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
