From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Keith Fiske <keith(at)omniti(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, levertond(at)googlemail(dot)com, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process |
Date: | 2014-06-19 23:30:59 |
Message-ID: | 20140619233059.GF16260@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 2014-06-06 18:21:45 -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-06-06 18:03:53 -0400, Tom Lane wrote:
> >> The point here seems to be that lazy_vacuum_page does the visibility map
> >> ops inside its own critical section. Why? Setting a visibility bit
> >> doesn't seem like it's critical. Why can't we just move the
> >> END_CRIT_SECTION() to before the PageIsAllVisible test?
>
> > Yea, that's what I am proposing upthread. If we move the visibility
> > tests out of the critical section this will get rid of the original
> > report as well.
>
> I went trolling for other critical sections ...
>
> lazy_scan_heap has same disease, but looks like it can be fixed same way.
Hm. I'm writing a fix for this issue right now and went to look into
lazy_scan_heap() - but I'm not seeing a problem atm. Do you mean that
bit:
START_CRIT_SECTION();
/* mark buffer dirty before writing a WAL record */
MarkBufferDirty(buf);
/*
* It's possible that another backend has extended the heap,
* initialized the page, and then failed to WAL-log the page
* due to an ERROR. Since heap extension is not WAL-logged,
* recovery might try to replay our record setting the page
* all-visible and find that the page isn't initialized, which
* will cause a PANIC. To prevent that, check whether the
* page has been previously WAL-logged, and if not, do that
* now.
*/
if (RelationNeedsWAL(onerel) &&
PageGetLSN(page) == InvalidXLogRecPtr)
log_newpage_buffer(buf, true);
PageSetAllVisible(page);
visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
vmbuffer, InvalidTransactionId);
END_CRIT_SECTION();
If so, I think that should actually be safe - visibilitymap_set()
shouldn't ever do IO - that's why it gets the vmbuffer argument passed
in.
Now, afaics there isn't really any need for the critical section in this
case but the assertion in log_newpage_buffer(). So, afaics, we could
just end the do the END_CRIT_SECTION() two lines earlier, to keep
it as short as possible.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2014-06-19 23:59:57 | Re: BUG #10533: 9.4 beta1 assertion failure in autovacuum process |
Previous Message | Alvaro Herrera | 2014-06-19 22:12:41 | Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts |