From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: sequential scans that pick up only deleted records do not honor query cancel or timeout
Date: 2012-05-22 17:14:04
Message-ID: CAHyXU0xPW=JBur1FvC-ZbKXiLNbVzX6-pHGesfJiTqsMLXpYyQ@mail.gmail.com
Lists: pgsql-bugs
Basically, $subject says it all. It's pretty easy to reproduce:
delete all the records from a large table and execute any sequentially
scanning query before autovacuum comes around and cleans the table
up; the query will be uncancellable. This can result in fairly
pathological behavior on i/o-constrained systems because the query
will bog itself down writing out hint bits for minutes or hours,
with no way to cancel it and no effective i/o throttling (unlike vacuum).
IMO, this should be backpatched, and is likely fixed by injecting an
interrupts check at a strategic location. But where? I was thinking
in heapgetpage(), but there are no checks elsewhere in heapam.c, which is
a red flag.
merlin
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: sequential scans that pick up only deleted records do not honor query cancel or timeout
Date: 2012-05-22 21:08:32
Message-ID: 8558.1337720912@sss.pgh.pa.us
Lists: pgsql-bugs
Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> Basically, $subject says it all. It's pretty easy to reproduce:
> delete all the records from a large table and execute any sequentially
> scanning query before autovacuum comes around and cleans the table
> up; the query will be uncancellable. This can result in fairly
> pathological behavior on i/o-constrained systems because the query
> will bog itself down writing out hint bits for minutes or hours,
> with no way to cancel it and no effective i/o throttling (unlike vacuum).
> IMO, this should be backpatched, and is likely fixed by injecting an
> interrupts check at a strategic location. But where? I was thinking
> in heapgetpage(), but there are no checks elsewhere in heapam.c, which is
> a red flag.
heapgetpage() seems like the most reasonable place to me, as there we'll
only be making the check once per page not once per tuple.
regards, tom lane
From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: sequential scans that pick up only deleted records do not honor query cancel or timeout
Date: 2012-05-22 22:39:02
Message-ID: CAHyXU0xS_0xx92bxsMHfRRgo0e3rS1bg0ngbz9qN=Buee7P32A@mail.gmail.com
Lists: pgsql-bugs
On Tue, May 22, 2012 at 4:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> Basically, $subject says it all. It's pretty easy to reproduce:
>> delete all the records from a large table and execute any sequentially
>> scanning query before autovacuum comes around and cleans the table
>> up; the query will be uncancellable. This can result in fairly
>> pathological behavior on i/o-constrained systems because the query
>> will bog itself down writing out hint bits for minutes or hours,
>> with no way to cancel it and no effective i/o throttling (unlike vacuum).
>
>> IMO, this should be backpatched, and is likely fixed by injecting an
>> interrupts check at a strategic location. But where? I was thinking
>> in heapgetpage(), but there are no checks elsewhere in heapam.c, which is
>> a red flag.
>
> heapgetpage() seems like the most reasonable place to me, as there we'll
> only be making the check once per page not once per tuple.
ok. this fixes the issue:
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
new file mode 100644
index 0d6fe3f..acef385
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*************** heapgetpage(HeapScanDesc scan, BlockNumb
*** 287,292 ****
--- 287,299 ----
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
+ /*
+ * We have to check for signals here because a long series of
+ * pages containing nothing but deleted tuples can cause control
+ * to remain in the scan loop for an unbounded amount of time.
+ */
+ CHECK_FOR_INTERRUPTS();
+
Assert(ntup <= MaxHeapTuplesPerPage);
scan->rs_ntuples = ntup;
}
merlin
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: sequential scans that pick up only deleted records do not honor query cancel or timeout
Date: 2012-05-22 23:13:12
Message-ID: 12013.1337728392@sss.pgh.pa.us
Lists: pgsql-bugs
Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Tue, May 22, 2012 at 4:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> heapgetpage() seems like the most reasonable place to me, as there we'll
>> only be making the check once per page not once per tuple.
> ok. this fixes the issue:
Well, actually it needs to be a bit earlier than that or it won't stop
non-pageatatime scans (cf the return at line 230 in HEAD). I was
thinking to place it at the point where we hold no buffer pin, just to
save a couple cycles in error cleanup:
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0d6fe3f0acd38a87af759bd3d5196da504202486..0c67156390a6a1bf1fee0bb39b96e0da8cd55dbb 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*************** heapgetpage(HeapScanDesc scan, BlockNumb
*** 222,227 ****
--- 222,234 ----
scan->rs_cbuf = InvalidBuffer;
}
+ /*
+ * Be sure to check for interrupts at least once per page. Checks at
+ * higher code levels won't be able to stop a seqscan that encounters
+ * many pages' worth of consecutive dead tuples.
+ */
+ CHECK_FOR_INTERRUPTS();
+
/* read page using selected strategy */
scan->rs_cbuf = ReadBufferExtended(scan->rs_rd, MAIN_FORKNUM, page,
RBM_NORMAL, scan->rs_strategy);
But thanks for verifying that a check in this function does fix the
issue for you.
regards, tom lane