Skip site navigation (1) Skip section navigation (2)

Peripheral Links

Header And Logo

PostgreSQL
| The world's most advanced open source database.

Site Navigation

Search for
  Advanced Search

Re: ERROR: invalid memory alloc request size <a_big_number_here>


  • From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
  • To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
  • Cc: Matteo Beccati <php(at)beccati(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
  • Subject: Re: ERROR: invalid memory alloc request size <a_big_number_here>
  • Date: Thu, 27 Oct 2005 12:57:58 -0300
  • Message-id: <20051027155758(dot)GA15119(at)surnet(dot)cl>

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> > I think the problem is that CreateMultiXactId calls
> > GetNewMultiXactId and then RecordNewMultiXact, and the lock is released
> > between the calls.  So one backend could try to read the offset before
> > another one had the time to finish writing it.
> 
> Ugh, yes, that is clearly a hole :-( even if it turns out not to explain
> Matteo's observation.
> 
> I don't see any easy way to fix this except by introducing a lot more
> locking than is there now --- ie, holding the MultiXactGenLock until the
> new mxact's starting offset has been written to disk.  Any better ideas?

Well, it isn't a very good solution because it requires us to retain the
MultiXactGenLock past a XLogInsert and some I/O on SLRU pages.
Previously the lock was mostly only used in short operations and very
rarely held during I/O.  But I don't see any other solution either.
Patch attached.

I confess being attracted to Martijn's idea of looping until the correct
answer is obtained.  I don't think it's even too difficult to implement.
But I wonder if there's some hidden pitfall.

Thanks to Matteo for finding the bug!

-- 
Alvaro Herrera                                http://www.PlanetPostgreSQL.org
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)
Index: src/backend/access/transam/multixact.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.9
diff -c -r1.9 multixact.c
*** src/backend/access/transam/multixact.c	15 Oct 2005 02:49:09 -0000	1.9
--- src/backend/access/transam/multixact.c	27 Oct 2005 15:45:24 -0000
***************
*** 633,638 ****
--- 633,639 ----
  	/*
  	 * OK, assign the MXID and offsets range to use
  	 */
+ 	LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
  	multi = GetNewMultiXactId(nxids, &offset);
  
  	debug_elog4(DEBUG2, "Create: assigned id %u offset %u", multi, offset);
***************
*** 665,671 ****
  
  	(void) XLogInsert(RM_MULTIXACT_ID, XLOG_MULTIXACT_CREATE_ID, rdata);
  
! 	/* Now enter the information into the OFFSETs and MEMBERs logs */
  	RecordNewMultiXact(multi, offset, nxids, xids);
  
  	/* Store the new MultiXactId in the local cache, too */
--- 666,675 ----
  
  	(void) XLogInsert(RM_MULTIXACT_ID, XLOG_MULTIXACT_CREATE_ID, rdata);
  
! 	/*
! 	 * Now enter the information into the OFFSETs and MEMBERs logs.
! 	 * MultiXactGenLock is released here.
! 	 */
  	RecordNewMultiXact(multi, offset, nxids, xids);
  
  	/* Store the new MultiXactId in the local cache, too */
***************
*** 681,686 ****
--- 685,693 ----
   *		Write info about a new multixact into the offsets and members files
   *
   * This is broken out of CreateMultiXactId so that xlog replay can use it.
+  *
+  * The caller is assumed to hold the MultiXactGenLock, which will be
+  * released by this routine.
   */
  static void
  RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
***************
*** 713,718 ****
--- 720,731 ----
  
  	MultiXactOffsetCtl->shared->page_status[slotno] = SLRU_PAGE_DIRTY;
  
+ 	/*
+ 	 * Now that the offset has been written, we can release the
+ 	 * MultiXactGenLock.
+ 	 */
+ 	LWLockRelease(MultiXactGenLock);
+ 
  	/* Exchange our lock */
  	LWLockRelease(MultiXactOffsetControlLock);
  
***************
*** 756,761 ****
--- 769,777 ----
   * files.  Unfortunately, we have to do that while holding MultiXactGenLock
   * to avoid race conditions --- the XLOG record for zeroing a page must appear
   * before any backend can possibly try to store data in that page!
+  *
+  * The caller is assumed to hold the MultiXactGenLock, and it will be held
+  * at exit.
   */
  static MultiXactId
  GetNewMultiXactId(int nxids, MultiXactOffset *offset)
***************
*** 767,774 ****
  	/* MultiXactIdSetOldestMember() must have been called already */
  	Assert(MultiXactIdIsValid(OldestMemberMXactId[MyBackendId]));
  
- 	LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
- 
  	/* Handle wraparound of the nextMXact counter */
  	if (MultiXactState->nextMXact < FirstMultiXactId)
  		MultiXactState->nextMXact = FirstMultiXactId;
--- 783,788 ----
***************
*** 800,807 ****
  
  	MultiXactState->nextOffset += nxids;
  
- 	LWLockRelease(MultiXactGenLock);
- 
  	debug_elog4(DEBUG2, "GetNew: returning %u offset %u", result, *offset);
  	return result;
  }
--- 814,819 ----
***************
*** 1777,1782 ****
--- 1789,1795 ----
  		int			i;
  
  		/* Store the data back into the SLRU files */
+ 		LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
  		RecordNewMultiXact(xlrec->mid, xlrec->moff, xlrec->nxids, xids);
  
  		/* Make sure nextMXact/nextOffset are beyond what this record has */


Home | Main Index | Thread Index

Privacy Policy | PostgreSQL Archives hosted by Command Prompt, Inc. | Designed by tinysofa
Copyright © 1996 – 2008 PostgreSQL Global Development Group