commit | dcba39d352b48c39d503e46a8098d33364889e8f | [log] [tgz] |
---|---|---|
author | Peter Zijlstra <[email protected]> | Tue Oct 10 20:57:39 2023 +0200 |
committer | Dylan Chang <[email protected]> | Thu Dec 28 03:35:22 2023 +0000 |
tree | 254f0878e41451137341c8d5f3799c6304112fde | |
parent | 64bc9d8bbb5de27d254c915804feba0a9b75df72 [diff] |
BACKPORT: sched: Fix stop_one_cpu_nowait() vs hotplug [ Upstream commit f0498d2a54e7966ce23cd7c7ff42c64fa0059b07 ] Kuyo reported sporadic failures on a sched_setaffinity() vs CPU hotplug stress-test -- notably affine_move_task() remains stuck in wait_for_completion(), leading to a hung-task detector warning. Specifically, it was reported that stop_one_cpu_nowait(.fn = migration_cpu_stop) returns false -- this stopper is responsible for the matching complete(). The race scenario is: CPU0 CPU1 // doing _cpu_down() __set_cpus_allowed_ptr() task_rq_lock(); takedown_cpu() stop_machine_cpuslocked(take_cpu_down..) <PREEMPT: cpu_stopper_thread() MULTI_STOP_PREPARE ... __set_cpus_allowed_ptr_locked() affine_move_task() task_rq_unlock(); <PREEMPT: cpu_stopper_thread()\> ack_state() MULTI_STOP_RUN take_cpu_down() __cpu_disable(); stop_machine_park(); stopper->enabled = false; /> /> stop_one_cpu_nowait(.fn = migration_cpu_stop); if (stopper->enabled) // false!!! That is, by doing stop_one_cpu_nowait() after dropping rq-lock, the stopper thread gets a chance to preempt and allows the cpu-down for the target CPU to complete. OTOH, since stop_one_cpu_nowait() / cpu_stop_queue_work() needs to issue a wakeup, it must not be ran under the scheduler locks. Solve this apparent contradiction by keeping preemption disabled over the unlock + queue_stopper combination: preempt_disable(); task_rq_unlock(...); if (!stop_pending) stop_one_cpu_nowait(...) preempt_enable(); This respects the lock ordering contraints while still avoiding the above race. That is, if we find the CPU is online under rq-lock, the targeted stop_one_cpu_nowait() must succeed. Apply this pattern to all similar stop_one_cpu_nowait() invocations. Bug: 317318329 Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Reported-by: "Kuyo Chang (張建文)" <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: "Kuyo Chang (張建文)" <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Dylan Chang <[email protected]> Change-Id: Ib2cc52566f43c3c10f694ce9c1c6a6569b4e2687
BEST: Make all of your changes to upstream Linux. If appropriate, backport to the stable releases. These patches will be merged automatically in the corresponding common kernels. If the patch is already in upstream Linux, post a backport of the patch that conforms to the patch requirements below.
EXPORT_SYMBOL_GPL()
require an in-tree modular driver that uses the symbol -- so include the new driver or changes to an existing driver in the same patchset as the export.LESS GOOD: Develop your patches out-of-tree (from an upstream Linux point-of-view). Unless these are fixing an Android-specific bug, these are very unlikely to be accepted unless they have been coordinated with [email protected]. If you want to proceed, post a patch that conforms to the patch requirements below.
scripts/checkpatch.pl
UPSTREAM:
, BACKPORT:
, FROMGIT:
, FROMLIST:
, or ANDROID:
.Change-Id:
tag (see https://gerrit-review.googlesource.com/Documentation/user-changeid.html)Bug:
tag.Signed-off-by:
tag by the author and the submitterAdditional requirements are listed below based on patch type
UPSTREAM:
, BACKPORT:
UPSTREAM:
.(cherry picked from commit ...)
lineimportant patch from upstream This is the detailed description of the important patch Signed-off-by: Fred Jones <[email protected]>
- then Joe Smith would upload the patch for the common kernel as
UPSTREAM: important patch from upstream This is the detailed description of the important patch Signed-off-by: Fred Jones <[email protected]> Bug: 135791357 Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01 (cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1) Signed-off-by: Joe Smith <[email protected]>
BACKPORT:
instead of UPSTREAM:
.UPSTREAM:
(cherry picked from commit ...)
lineBACKPORT: important patch from upstream This is the detailed description of the important patch Signed-off-by: Fred Jones <[email protected]> Bug: 135791357 Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01 (cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1) [joe: Resolved minor conflict in drivers/foo/bar.c ] Signed-off-by: Joe Smith <[email protected]>
FROMGIT:
, FROMLIST:
,FROMGIT:
(cherry picked from commit <sha1> <repo> <branch>)
. This must be a stable maintainer branch (not rebased, so don't use linux-next
for example).BACKPORT: FROMGIT:
important patch from upstream This is the detailed description of the important patch Signed-off-by: Fred Jones <[email protected]>
- then Joe Smith would upload the patch for the common kernel as
FROMGIT: important patch from upstream This is the detailed description of the important patch Signed-off-by: Fred Jones <[email protected]> Bug: 135791357 (cherry picked from commit 878a2fd9de10b03d11d2f622250285c7e63deace https://git.kernel.org/pub/scm/linux/kernel/git/foo/bar.git test-branch) Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01 Signed-off-by: Joe Smith <[email protected]>
FROMLIST:
Link:
tag with a link to the submittal on lore.kernel.orgBug:
tag with the Android bug (required for patches not accepted into a maintainer tree)BACKPORT: FROMLIST:
FROMLIST: important patch from upstream This is the detailed description of the important patch Signed-off-by: Fred Jones <[email protected]> Bug: 135791357 Link: https://lore.kernel.org/lkml/[email protected]/ Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01 Signed-off-by: Joe Smith <[email protected]>
ANDROID:
ANDROID:
Fixes:
tag that cites the patch with the bugANDROID: fix android-specific bug in foobar.c This is the detailed description of the important fix Fixes: 1234abcd2468 ("foobar: add cool feature") Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01 Signed-off-by: Joe Smith <[email protected]>
ANDROID:
Bug:
tag with the Android bug (required for android-specific features)