fs/epoll: use a per-cpu counter for user's watches count

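The epoll_watches counter in struct user_struct tracks the number of
watches a user has, so that it can be compared against the
'max_user_watches' limit. It is a single counter shared by all CPUs,
which makes it a scalability bottleneck for SPECjbb2015 on large systems,
where the whole workload runs as one user and every watch add or remove
updates the same counter. Changing it to a per-cpu counter increases
benchmark throughput by about 30% on a 16-socket, > 1000 thread system.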
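
As a rough userspace illustration of why splitting the counter helps, the
sketch below keeps one cache-line-aligned slot per "CPU", so updates touch
only the local slot and the limit check sums the slots. This is not the
kernel's percpu_counter implementation (which batches per-cpu deltas into
a global count); the names sharded_counter, sc_add(), sc_sum(),
watch_inc() and watch_dec(), and the NR_SLOTS value, are all hypothetical
and chosen only for this example.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS   8    /* hypothetical stand-in for the number of CPUs */
#define CACHE_LINE 64   /* assumed cache line size, in bytes */

/* One slot per CPU, padded so that slots never share a cache line. */
struct slot {
	_Alignas(CACHE_LINE) atomic_long count;
};

struct sharded_counter {
	struct slot slots[NR_SLOTS];
};

/* Update only the caller's slot; no cross-CPU cache line bouncing. */
static void sc_add(struct sharded_counter *c, unsigned int slot, long delta)
{
	atomic_fetch_add_explicit(&c->slots[slot % NR_SLOTS].count, delta,
				  memory_order_relaxed);
}

/* Sum all slots. A single slot may be negative; only the sum matters. */
static long sc_sum(struct sharded_counter *c)
{
	long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&c->slots[i].count,
					    memory_order_relaxed);
	return sum;
}

/*
 * Charge one watch if the user is under the limit. The check and the
 * increment are not one atomic step, so concurrent callers could
 * overshoot the limit slightly; this sketch accepts that imprecision.
 */
static int watch_inc(struct sharded_counter *c, unsigned int slot,
		     long max_watches)
{
	if (sc_sum(c) >= max_watches)
		return -1;
	sc_add(c, slot, 1);
	return 0;
}

/* Release one watch; may run on a different slot than the increment. */
static void watch_dec(struct sharded_counter *c, unsigned int slot)
{
	sc_add(c, slot, -1);
}
```

The trade-off is that reading the total becomes more expensive than
updating it, which suits a counter that is updated on every watch add or
remove but only needs to be compared against a limit.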

[[email protected]: fix build errors in kernel/user.c when CONFIG_EPOLL=n]
[[email protected]: move ifdefs into wrapper functions, slightly improve panic message]
  Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: tweak user_epoll_alloc(), per Guenter]
  Link: https://lkml.kernel.org/r/[email protected]

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reported-by: Anton Blanchard <[email protected]>
Cc: Alexander Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
3 files changed