Drivers: hv: vmbus: Offload the handling of channels to two workqueues
vmbus_process_offer() mustn't call channel->sc_creation_callback()
directly for sub-channels, because sc_creation_callback() ->
vmbus_open() may never get the host's response to the
OPEN_CHANNEL message (the host may rescind a channel at any time,
e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
may not wake up the vmbus_open() as it's blocked due to a non-zero
vmbus_connection.offer_in_progress, and finally we have a deadlock.
The above is also true for primary channels, if the related device
drivers use sync probing mode by default.
And, usually the handling of primary channels and sub-channels can
depend on each other, so we should offload them to different
workqueues to avoid possible deadlock, e.g. in sync-probing mode,
NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
and waits for all the sub-channels to appear, but the latter
can't get the rtnl_lock and this blocks the handling of sub-channels.
The patch can fix the multiple-NIC deadlock described above for
v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
but don't enable async-probing for Hyper-V drivers (yet).
The patch can also fix the hang issue in sub-channel's handling described
above for all versions of kernels, including v4.19 and v4.20-rc4.
So actually the patch should be applied to all the existing kernels,
not only the kernels that have 8195b1396ec8.
Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Cc: [email protected]
Cc: Stephen Hemminger <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Signed-off-by: Dexuan Cui <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index b3e2436..14131b6 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -905,6 +905,13 @@ struct vmbus_channel {
bool probe_done;
+ /*
+ * We must offload the handling of the primary/sub channels
+ * from the single-threaded vmbus_connection.work_queue to
+ * two different workqueue, otherwise we can block
+ * vmbus_connection.work_queue and hang: see vmbus_process_offer().
+ */
+ struct work_struct add_channel_work;
};
static inline bool is_hvsock_channel(const struct vmbus_channel *c)