Re: mvcc catalo gsnapshots and TopTransactionContext

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: mvcc catalo gsnapshots and TopTransactionContext
Date: 2013-08-06 03:12:14
Message-ID: 20130806031214.GC31572@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-08-05 13:09:31 -0400, Noah Misch wrote:
> On Fri, Jul 12, 2013 at 11:42:23AM +0200, Andres Freund wrote:
> > On 2013-07-11 15:09:45 -0400, Tom Lane wrote:
> > > It never has been, and never will be, allowed to call the catcache code
> > > without being in a transaction. What do you think will happen if the
> > > requested row isn't in cache? A table access, that's what, and that
> > > absolutely requires being in a transaction.
>
> > I'd like to add an Assert like in the attached patch making sure we're
> > in a transaction. Otherwise it's far too easy not to hit an error during
> > development because everything is cached - and syscache usage isn't
> > always obvious from the outside to the naive or the tired.
>
> The following test case reliably hits this new assertion:
>
> create table t (c int);
> begin;
> create index on t (c);
> savepoint q;
> insert into t values (1);
> select 1/0;
>
> Stack trace:
>
> #2 0x0000000000761159 in ExceptionalCondition (conditionName=<value optimized out>, errorType=<value optimized out>, fileName=<value optimized out>, lineNumber=<value optimized out>) at assert.c:54
> #3 0x000000000074fc53 in SearchCatCache (cache=0xc53048, v1=139479, v2=0, v3=0, v4=0) at catcache.c:1072
> #4 0x000000000075c011 in SearchSysCache (cacheId=0, key1=369539, key2=6, key3=18446744073709551615, key4=0) at syscache.c:909
> #5 0x0000000000757e57 in RelationReloadIndexInfo (relation=0x7f2dd5fffea0) at relcache.c:1770
> #6 0x000000000075817b in RelationClearRelation (relation=0x7f2dd5fffea0, rebuild=1 '\001') at relcache.c:1926
> #7 0x00000000007588c6 in RelationFlushRelation (relationId=139479) at relcache.c:2076
> #8 RelationCacheInvalidateEntry (relationId=139479) at relcache.c:2138
> #9 0x000000000075185d in LocalExecuteInvalidationMessage (msg=0xcde778) at inval.c:546
> #10 0x0000000000751185 in ProcessInvalidationMessages (hdr=0xc0d390, func=0x7517d0 <LocalExecuteInvalidationMessage>) at inval.c:422
> #11 0x0000000000751a3f in AtEOSubXact_Inval (isCommit=0 '\000') at inval.c:973
> #12 0x00000000004cb986 in AbortSubTransaction () at xact.c:4250
> #13 0x00000000004cbf05 in AbortCurrentTransaction () at xact.c:2863
> #14 0x00000000006968f6 in PostgresMain (argc=1, argv=0x7fff7cf71610, dbname=0xc13b60 "test", username=0xbedc88 "nm") at postgres.c:3848
> #15 0x000000000064c2b5 in BackendRun () at postmaster.c:4044
> #16 BackendStartup () at postmaster.c:3733
> #17 ServerLoop () at postmaster.c:1571
> #18 0x000000000064fa8d in PostmasterMain (argc=1, argv=0xbe84a0) at postmaster.c:1227
> #19 0x00000000005e2dcd in main (argc=1, argv=0xbe84a0) at main.c:196
>
> When we call AtEOSubXact_Inval() or AtEOXact_Inval() with a relation still
> open, we can potentially get a relcache rebuild and therefore a syscache
> lookup as shown above. CommitSubTransaction() is also potentially affected,
> though I don't have an SQL-level test case for that. It calls
> CommandCounterIncrement() after moving to TRANS_COMMIT. That CCI had better
> find no invalidations of open relations, or we'll make syscache lookups. (In
> simple tests, any necessary invalidations tend to happen at the CCI in
> CommitTransactionCommand(), so the later CCI does in fact find nothing to do.
> I have little confidence that should be counted upon, though.)
>
> How might we best rearrange things to avoid these hazards?

Man. This is a nasty one. And seems to have been there for a long
time. I am rather surprised that we haven't seen this as a cause of
problems.
Deferring invalidation processing a good bit seems to be the only way to
deal with this I can see but that seems to require an uncomfortable
amount of shuffling around in xact.c.

I'll think about it more, but so far I haven't seen anything close to
nice.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2013-08-06 03:18:00 Re: HeapTupleSatisfiesDirty fails to test HEAP_XMAX_IS_LOCKED_ONLY for TransactionIdIsInProgress(...)
Previous Message Andres Freund 2013-08-06 03:07:18 Re: HeapTupleSatisfiesDirty fails to test HEAP_XMAX_IS_LOCKED_ONLY for TransactionIdIsInProgress(...)