Triggering the GC to reclaim non-Java memory

Android applications and libraries commonly allocate “native” (i.e. C++) objects that are effectively owned by a Java object, and reclaimed when the garbage collector determines that the owning Java object is no longer reachable. Various mechanisms are used to accomplish this. These include traditional Java finalizers, more modern Java Cleaners, Android’s SystemCleaner and, for platform code, NativeAllocationRegistry. Internally, these all rely on the garbage collector’s processing of java.lang.ref.References.

Historically, we have encountered issues when large volumes of such “native” objects are owned by Java objects of significantly smaller size. The Java garbage collector normally decides when to collect based on occupancy of the Java heap. It will not collect while few bytes of Java objects have been newly allocated in the Java heap, even if those objects “own” large amounts of C++ memory, which cannot be reclaimed until the Java objects are reclaimed.

This led to the development of the VMRuntime.registerNativeAllocation API, and eventually the NativeAllocationRegistry API. Both of these allow the programmer to inform the ART GC that a certain amount of C++ memory had been allocated, and could only be reclaimed as the result of a Java GC.

This had the advantage that the GC would theoretically know exactly how much native memory was owned by Java objects. However, its major problem was that registering the exact size of such native objects frequently turned out to be impractical. Often a Java object did not own just a single native object, but instead owned a complex native data structure composed of many objects in the C++ heap. Their sizes commonly depended on prior computations by C++ code. Some of these might be shared between Java objects, and could only be reclaimed when all of those Java objects became unreachable. Often at least some of the native objects were allocated by third-party libraries that did not make the sizes of their internal objects available to clients.

In extreme cases, underestimation of native object sizes could cause native memory use to be so excessive that the device would become unstable. At one time, a particularly nasty arithmetic expression would cause the Google Calculator app to allocate sufficiently large native arrays of digits backing Java BigIntegers to force a restart of system processes. (This has since been addressed in other ways as well.)

Thus we switched to a scheme in which sizes of native objects are commonly no longer directly provided by clients. The GC instead occasionally calls mallinfo() to determine how much native memory has been allocated, and assumes that any of this may need the collector's help to reclaim. C++ memory allocated by means other than malloc/new and owned by Java objects should still be explicitly registered. This loses information about Java ownership, but results in much more accurate information about the total size of native objects, something that was previously very difficult to approximate, even to within an order of magnitude.

The triggering heuristic

Though we use mallinfo() to track native allocation, this call itself can be expensive, and thus we perform this check fairly rarely. More precisely, we do so only after the application has called NativeAllocationRegistry.registerNativeAllocation() a certain number of times or with a sufficiently large argument, or after VMRuntime.registerNativeAllocation is called. Thus an application not using these APIs, e.g. because it is running almost entirely native code, may never perform this check. This can be useful, in that an application that runs basically only native code, and thus deallocates its own native memory, does not trigger the GC.
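The gating described above can be sketched as a small counter. This is only an illustration under stated assumptions: the class name, method names, and the threshold values passed to the constructor are hypothetical, and the choice to reset the counter after a large registration is a guess, not something taken from the ART sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper deciding when the expensive mallinfo()-based check
// should run. Both thresholds are caller-supplied placeholders; the real
// ART constants are not stated in this document.
class NativeCheckGate {
 public:
  NativeCheckGate(std::uint32_t check_every_n_calls, std::size_t large_bytes)
      : check_every_n_calls_(check_every_n_calls), large_bytes_(large_bytes) {
    assert(check_every_n_calls > 0);
  }

  // Models a registration made through the registry-style API.
  // Returns true if the native heap should be checked now.
  bool OnRegistryAllocation(std::size_t bytes) {
    // A sufficiently large argument forces an immediate check.
    if (bytes >= large_bytes_) {
      calls_since_check_ = 0;  // Assumption: a check restarts the count.
      return true;
    }
    // Otherwise check only once every N registrations.
    if (++calls_since_check_ >= check_every_n_calls_) {
      calls_since_check_ = 0;
      return true;
    }
    return false;
  }

  // Models an explicit VMRuntime-style registration, which always checks.
  bool OnExplicitRegistration() { return true; }

 private:
  const std::uint32_t check_every_n_calls_;
  const std::size_t large_bytes_;
  std::uint32_t calls_since_check_ = 0;
};
```

With such a gate, a process that never calls either registration path never pays for the check, which matches the behavior described for mostly-native applications.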

The actual computation for triggering a native-allocation-GC is performed by Heap::NativeMemoryOverTarget(). This computes and compares two quantities:

  1. An adjusted heap size for GC triggering. This consists of the Java heap size at which we would normally trigger a GC plus an allowance for native heap size. This allowance currently consists of one half (background processes) or three halves (foreground processes) of NativeAllocationGcWatermark(). The latter is HeapMaxFree (typically 32 MB) plus 1/8 of the currently targeted heap size. For a foreground process, this allowance would typically be in the 50-100 MB range for something other than a low-end device.

  2. An adjusted count of current bytes allocated. This is basically the number of bytes currently allocated in the Java heap, plus half the net number of native bytes allocated since the last GC. The latter is computed as the change since the last GC in the total-bytes-allocated obtained from mallinfo() plus native_bytes_registered_. The native_bytes_registered_ field tracks the bytes that were explicitly registered via the registerNativeAllocation APIs and have not yet been unregistered. It excludes bytes registered as malloc-allocated via NativeAllocationRegistry. (The computation also considers the total amount of native memory currently allocated by this metric, as opposed to the change since the last GC. But that is currently de-weighted to the point of insignificance.)
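The two quantities can be written out as plain arithmetic. The following is a minimal sketch, not the ART implementation: all function and parameter names are hypothetical, the de-weighted total-native-memory term is omitted, and clamping a negative net native change to zero is an assumption. The numbers in the worked example are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kMB = 1024 * 1024;

// HeapMaxFree plus 1/8 of the currently targeted heap size.
inline std::size_t NativeGcWatermark(std::size_t heap_max_free,
                                     std::size_t target_heap_size) {
  return heap_max_free + target_heap_size / 8;
}

// Quantity 1: normal Java GC trigger size plus the native allowance,
// which is 3/2 of the watermark (foreground) or 1/2 of it (background).
inline std::size_t AdjustedGcTarget(std::size_t java_gc_trigger_bytes,
                                    std::size_t watermark,
                                    bool is_foreground) {
  const std::size_t allowance =
      is_foreground ? watermark + watermark / 2 : watermark / 2;
  return java_gc_trigger_bytes + allowance;
}

// Quantity 2: Java bytes allocated plus half the net native bytes
// allocated since the last GC. The native total is the mallinfo()-derived
// byte count plus the explicitly registered bytes.
inline std::size_t AdjustedBytesAllocated(std::size_t java_bytes_allocated,
                                          std::size_t native_total_now,
                                          std::size_t native_total_at_last_gc) {
  // Assumption: a net decrease in native memory counts as zero.
  const std::size_t net_new = native_total_now > native_total_at_last_gc
                                  ? native_total_now - native_total_at_last_gc
                                  : 0;
  return java_bytes_allocated + net_new / 2;
}

// Worked example with made-up inputs for a foreground process.
inline bool WorkedExampleIsOverTarget() {
  const std::size_t watermark = NativeGcWatermark(32 * kMB, 256 * kMB);
  assert(watermark == 64 * kMB);  // 32 MB + 256 MB / 8
  const std::size_t target = AdjustedGcTarget(200 * kMB, watermark, true);
  assert(target == 296 * kMB);    // 200 MB + 96 MB allowance
  const std::size_t allocated =
      AdjustedBytesAllocated(150 * kMB, 400 * kMB, 100 * kMB);
  assert(allocated == 300 * kMB); // 150 MB + (400 MB - 100 MB) / 2
  return allocated > target;
}
```

In this example the 96 MB foreground allowance falls inside the 50-100 MB range mentioned in item 1. The Java heap alone (150 MB) is well under its 200 MB trigger point, but the 300 MB of new native memory pushes the adjusted count past the adjusted target.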

A background GC is triggered if the second quantity exceeds the first. A stop-the-world GC is triggered if the second quantity is at least 4 times as large as the first, and the native heap currently occupies a large fraction of device memory, suggesting that the GC is falling behind and endangering device usability.
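The decision rule can be sketched as a single function. This is an illustration only: the enum and function names are hypothetical, and how "a large fraction of device memory" is measured is not specified here, so it is passed in as a precomputed flag.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outcome of the native-allocation check.
enum class NativeGcAction { kNone, kBackgroundGc, kStopTheWorldGc };

// adjusted_bytes_allocated is the "second quantity" and
// adjusted_gc_target the "first quantity" from the text.
inline NativeGcAction ChooseNativeGcAction(
    std::size_t adjusted_bytes_allocated,
    std::size_t adjusted_gc_target,
    bool native_heap_is_large_fraction_of_device_memory) {
  assert(adjusted_gc_target > 0);
  // No GC unless the adjusted allocation count exceeds the target.
  if (adjusted_bytes_allocated <= adjusted_gc_target) {
    return NativeGcAction::kNone;
  }
  // Dividing instead of multiplying avoids overflow; for integers,
  // allocated / 4 >= target is equivalent to allocated >= 4 * target.
  const bool far_over_target =
      adjusted_bytes_allocated / 4 >= adjusted_gc_target;
  if (far_over_target && native_heap_is_large_fraction_of_device_memory) {
    return NativeGcAction::kStopTheWorldGc;
  }
  return NativeGcAction::kBackgroundGc;
}
```

Note that both conditions must hold for the stop-the-world case: being far over target on a device with plenty of free memory still yields only a background GC.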

The fact that we consider both Java and native memory use at once means that we are more likely to trigger a native GC when we are closer to the normal Java GC threshold.

The actual use of mallinfo() to compute native heap occupancy reflects experiments with different libc implementations. These have different performance characteristics, and sometimes disagree on the interpretation of struct mallinfo fields. We believe our current implementation is solid with scudo and jemalloc on Android, and minimally usable for testing elsewhere.

(Some of this assumes typical current (May 2024) configuration constants, and may need to be updated.)