| Triggering the GC to reclaim non-Java memory |
| -------------------------------------------- |
| |
| Android applications and libraries commonly allocate "native" (i.e. C++) objects that are |
| effectively owned by a Java object, and reclaimed when the garbage collector determines that the |
| owning Java object is no longer reachable. Various mechanisms are used to accomplish this. These |
| include traditional Java finalizers, more modern Java `Cleaner`s, Android's `SystemCleaner` and, |
| for platform code, `NativeAllocationRegistry`. Internally, these all rely on the garbage |
| collector's processing of `java.lang.ref.Reference`s. |
| |
| Historically, we have encountered issues when large volumes of such "native" objects are owned by |
| Java objects of significantly smaller size. The Java garbage collector normally decides when to |
| collect based on occupancy of the Java heap. It would not collect if there are few bytes of newly |
| allocated Java objects in the Java heap, even if they "own" large amounts of C++ memory, which |
| cannot be reclaimed until the Java objects are reclaimed. |
| |
| This led to the development of the `VMRuntime.registerNativeAllocation` API, and eventually the |
| `NativeAllocationRegistry` API. Both of these allow the programmer to inform the ART GC that a |
| certain amount of C++ memory had been allocated, and could only be reclaimed as the result of a |
| Java GC. |
| |
| This had the advantage that the GC would theoretically know exactly how much native memory was |
| owned by Java objects. However, its major problem was that registering the exact size of such |
| native objects frequently turned out to be impractical. Often a Java object does not just own a |
| single native object, but instead owns a complex native data structure composed of many objects in |
| the C++ heap. Their sizes commonly depended on prior computations by C++ code. Some of these might |
| be shared between Java objects, and could only be reclaimed when all of those Java objects became |
| unreachable. Often at least some of the native objects were allocated by third-party libraries |
| that did not make the sizes of its internal objects available to clients. |
| |
| In extreme cases, underestimation of native object sizes could cause native memory use to be so |
| excessive that the device would become unstable. At one time, a particularly nasty arithmetic |
| expression would cause the Google Calculator app to allocate sufficiently large native arrays of |
| digits backing Java `BigInteger`s to force a restart of system processes. (This has since been |
| addressed in other ways as well.) |
| |
| Thus we switched to a scheme in which sizes of native objects are commonly no longer directly |
| provided by clients. The GC instead occasionally calls `mallinfo()` to determine how much native |
| memory has been allocated, and assumes that any of this may need the collector's help to reclaim. |
| C++ memory allocated by means other than `malloc`/`new` and owned by Java objects should still be |
| explicitly registered. This loses information about Java ownership, but results in much more |
| accurate information about the total size of native objects, something that was previously very |
| difficult to approximate, even to within an order of magnitude. |
| |
| The triggering heuristic |
| ------------------------ |
| |
| Though we use mallinfo() to track native allocation, this call itself can be expensive, and thus |
| we perform this check fairly rarely. More precisely, we do so only after the application has |
| called `NativeAllocationRegistry.registerNativeAllocation()` a certain number of times or |
| with a sufficiently large argument, or after `VMRuntime.registerNativeAllocation` is called. |
| Thus an application not using these APIs, e.g. because it is running almost entirely native |
| code, may never do so. This can be useful, in that an application that runs basically only |
| native code, and thus deallocates its own native memory, does not trigger the GC. |
| |
| The actual computation for triggering a native-allocation-GC is performed by |
| `Heap::NativeMemoryOverTarget()`. This computes and compares two quantities: |
| |
| 1. An adjusted heap size for GC triggering. This consists of the Java heap size at which we would |
| normally trigger a GC plus an allowance for native heap size. This allowance currently consists |
| of one half (background processes) or three halves (foreground processes) of |
| `NativeAllocationGcWatermark()`. The latter is HeapMaxFree (typically 32MB) plus 1/8 of the |
| currently targeted heap size. For a foreground process, this allowance would typically be in |
| the 50-100 MB range for something other than a low-end device. |
| |
| 2. An adjusted count of current bytes allocated. This is basically the number of bytes currently |
| allocated in the Java heap, plus half the net number of native bytes allocated since the last |
| GC. The latter is computed as the change since the last GC in the total-bytes-allocated |
| obtained from `mallinfo()` plus `native_bytes_registered_`. The `native_bytes_registered_` |
| field tracks the bytes explicitly registered, and not yet unregistered via |
| `registerNativeAllocation` APIs. It excludes bytes registered as malloc-allocated via |
| `NativeAllocationRegistry`. (The computation also considers the total amount of native memory |
| currently allocated by this metric, as opposed to the change since the last GC. But that is |
| currently de-weighted to the point of insignificance.) |
| |
| A background GC is triggered if the second quantity exceeds the first. A stop-the-world-GC is |
| triggered if the second quantity is at least 4 times larger than the first, and the native heap |
| currently occupies a large fraction of device memory, suggesting that the GC is falling behind and |
| endangering device usability. |
| |
| The fact that we consider both Java and native memory use at once means that we are more likely to |
| trigger a native GC when we are closer to the normal Java GC threshold. |
| |
| The actual use of `mallinfo()` to compute native heap occupancy reflects experiments with |
| different `libc` implementations. These have different performance characteristics, and sometimes |
| disagree on the interpretation of `struct mallinfo` fields. We believe our current implementation |
| is solid with scudo and jemalloc on Android, and minimally usable for testing elsewhere. |
| |
| (Some of this assumes typical current (May 2024) configuration constants, and may need to be |
| updated.) |