
 To buy some performance on the common, uninstrumented fast path, we replace repeated
 checks for both allocation instrumentation and allocator changes with a single function table
 dispatch, plus templatized allocation code that can generate either instrumented
 or uninstrumented versions of the allocation routines.

 When we call an allocation routine, we always indirect through a thread-local function table that
 points to either the instrumented or the uninstrumented allocation routines. The instrumented code
 has a `kInstrumented` = true template argument (or `kIsInstrumented` in some places); the
 uninstrumented code has `kInstrumented` = false.
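
As a minimal sketch of this pattern (not the actual ART code), the block below generates both variants from one template and dispatches through a thread-local table. `AllocStats`, `AllocObject`, `AllocEntryPoints`, `tls_entry_points`, and `Allocate` are hypothetical names chosen for illustration, and `malloc` stands in for the real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical counters standing in for whatever an instrumented build records.
struct AllocStats {
  std::size_t allocations = 0;
  std::size_t bytes = 0;
};
AllocStats g_stats;

// One template yields both variants. kInstrumented is a compile-time constant,
// so the compiler can drop the bookkeeping from the uninstrumented instantiation.
template <bool kInstrumented>
void* AllocObject(std::size_t size) {
  void* obj = std::malloc(size);
  if (kInstrumented && obj != nullptr) {
    ++g_stats.allocations;
    g_stats.bytes += size;
  }
  return obj;
}

// Hypothetical one-entry function table; a real table would hold many entry points.
struct AllocEntryPoints {
  void* (*alloc_object)(std::size_t);
};

const AllocEntryPoints kUninstrumentedTable = {&AllocObject<false>};
const AllocEntryPoints kInstrumentedTable = {&AllocObject<true>};

// Each thread dispatches through its own copy of the table.
thread_local AllocEntryPoints tls_entry_points = kUninstrumentedTable;

// Callers never test an "is instrumented" flag; they only make one indirect call.
void* Allocate(std::size_t size) {
  assert(tls_entry_points.alloc_object != nullptr);
  return tls_entry_points.alloc_object(size);
}
```

Assigning `kInstrumentedTable` to `tls_entry_points` switches the calling thread to the instrumented routine without touching any call site, which is the point of paying for one indirection instead of repeated checks.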

 The function table is thread-local. There appears to be no logical necessity for that; it just
 makes it easier to access from compiled Java code.

 - The function table is switched out by `InstrumentQuickAllocEntryPoints[Locked]`, and a
 corresponding `UninstrumentQuickAlloc`... function.

 - These in turn are called by `SetStatsEnabled()`, `SetAllocationListener()`, et al., which
 require that the mutator lock is not held.

 - With a started runtime, `SetEntrypointsInstrumented()` calls `ScopedSuspendAll()` before updating
 the function table.

 Mutual exclusion in the dispatch table is thus ensured by the fact that it is only updated while
 all other threads are suspended, and is only accessed with the mutator lock logically held,
 which inhibits suspension.
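
The locking protocol can be modeled roughly as follows. This is a sketch under stated assumptions, not ART's implementation: a `std::shared_mutex` stands in for the mutator lock (running threads hold it shared, and suspending all threads is modeled as taking it exclusively), and `g_mutator_lock`, `g_alloc_entrypoint`, `CallAllocator`, and `SetInstrumented` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the mutator lock.
std::shared_mutex g_mutator_lock;

// Each routine reports whether it is the instrumented variant.
using AllocFn = bool (*)();
bool AllocUninstrumented() { return false; }
bool AllocInstrumented() { return true; }

// Dispatch table entry. It is deliberately not atomic: only the locking
// protocol below protects it.
AllocFn g_alloc_entrypoint = &AllocUninstrumented;

// Reader side: the table is only read while the mutator lock is held (shared),
// so a concurrent swap cannot happen in the middle of the call.
bool CallAllocator() {
  std::shared_lock<std::shared_mutex> mutator(g_mutator_lock);
  assert(g_alloc_entrypoint != nullptr);
  return g_alloc_entrypoint();
}

// Writer side: exclusive ownership models "all other threads are suspended".
// The caller must not already hold the lock in shared mode, or this deadlocks.
void SetInstrumented(bool instrumented) {
  std::unique_lock<std::shared_mutex> suspend_all(g_mutator_lock);
  g_alloc_entrypoint = instrumented ? &AllocInstrumented : &AllocUninstrumented;
}
```

The deadlock noted on `SetInstrumented` mirrors why the callers listed above require that the mutator lock is not held.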

 To ensure correctness, we thus must:

 1. Suspend all threads when swapping out the dispatch table.
 2. Make sure that we hold the mutator lock when accessing it.
 3. Not trust `kInstrumented` once we've given up the mutator lock, since the instrumentation state
    could have changed in the interim.
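
The third rule is the easiest to overlook, so here is a small single-threaded sketch of it. Everything in it is hypothetical: `FakeRuntime`, `AllocSlowPath`, and the `suspend_point` callback only model a routine that releases the mutator lock partway through, during which the table may be swapped.

```cpp
#include <cassert>
#include <functional>

// Hypothetical runtime state; alloc_instrumented mirrors which table is installed.
struct FakeRuntime {
  bool alloc_instrumented = false;
  int allocations_reported = 0;
};

// Hypothetical slow path. suspend_point models any step that gives up the
// mutator lock (for example, waiting for a GC), during which another thread
// may suspend everyone and swap the dispatch table.
template <bool kInstrumented>
void AllocSlowPath(FakeRuntime& runtime, const std::function<void()>& suspend_point) {
  assert(suspend_point != nullptr);
  suspend_point();
  // kInstrumented was fixed when this routine was entered and may now be stale,
  // so consult the live runtime state instead.
  if (runtime.alloc_instrumented) {
    ++runtime.allocations_reported;
  }
}
```

If instrumentation is enabled during the suspend point, a call that entered as `AllocSlowPath<false>` still reports the allocation; a version that branched on `kInstrumented` after the suspend point would silently miss it.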