Date: Fri, 24 Oct 2014 11:07:46 -0400
From: Burke Libbey <>
Subject: [PATCH] sched: reset sched_entity depth on changing parent

From 2014-02-15: https://lkml.org/lkml/2014/2/15/217
This issue was reported and patched, but it still occurs in some situations on newer kernel versions.
[2249353.328452] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
[2249353.336528] IP: [<ffffffff810b1cf7>] check_preempt_wakeup+0xe7/0x210
se.parent can get out of sync with se.depth, causing a panic when the algorithm in find_matching_se assumes the two are consistent. This patch forces se.depth to be updated every time se.parent is, so they can no longer diverge.
CC: Ingo Molnar <[email protected]>
CC: Peter Zijlstra <[email protected]>
Signed-off-by: Burke Libbey <[email protected]>
---
I haven't been able to isolate the problem. Though I'm pretty confident this fixes the issue I've been having, I have not been able to prove it.
I kept a debugging journal if you want more context for whatever reason: https://gist.github.com/burke/c60dc5b8f0ba9bfd9275
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 24156c84..bcffba2 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -844,6 +844,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	p->se.cfs_rq = tg->cfs_rq[cpu];
 	p->se.parent = tg->se[cpu];
+	p->se.depth = p->se.parent ? p->se.parent->depth + 1 : 0;
 #endif
 
 #ifdef CONFIG_RT_GROUP_SCHED
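
For readers who want to see why a stale depth matters, here is a minimal userspace sketch of the invariant the patch enforces. It is not kernel code: struct entity, set_parent() and match_level() are hypothetical stand-ins, the walk is only loosely modeled on find_matching_se, and "same group" is approximated by comparing parent pointers. Where the sketch returns -1, the real walk would presumably step onto a NULL entity and dereference it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's sched_entity: only the two
 * fields touched by the patch are modeled. */
struct entity {
	struct entity *parent;
	int depth;
};

/* Set the parent and recompute depth in one step, using the same
 * expression the patch adds to set_task_rq(). */
static void set_parent(struct entity *se, struct entity *parent)
{
	se->parent = parent;
	se->depth = parent ? parent->depth + 1 : 0;
}

/* Simplified walk: use the cached depths to bring both entities to the
 * same level, then climb in lockstep until they share a parent.
 * Returns 0 on success, or -1 if the walk runs off the top of the
 * hierarchy, which only happens when a cached depth is wrong. */
static int match_level(struct entity **se, struct entity **pse)
{
	assert(se && pse && *se && *pse);

	int se_depth = (*se)->depth;
	int pse_depth = (*pse)->depth;

	while (se_depth > pse_depth) {
		se_depth--;
		*se = (*se)->parent;
		if (!*se)
			return -1;
	}

	while (pse_depth > se_depth) {
		pse_depth--;
		*pse = (*pse)->parent;
		if (!*pse)
			return -1;
	}

	while ((*se)->parent != (*pse)->parent) {
		*se = (*se)->parent;
		*pse = (*pse)->parent;
		if (!*se || !*pse)
			return -1;
	}

	return 0;
}
```

As long as every parent change goes through set_parent(), the two entities reach a common level and the walk succeeds. If one entity keeps a depth from a previous parent (for example, depth 0 while it now sits one level down), the leveling step moves the other entity too far up and the lockstep climb reaches NULL.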