Android API client-side caching guidelines

go/android-api-caching

Motivation

Android API calls typically involve non-negligible latency and computation per invocation. Client-side caching is therefore an important consideration in designing APIs that are helpful, correct, and performant.

APIs exposed to app developers via the Android SDK are often implemented as client code in the Android Framework that makes a Binder IPC call to a system service in a platform process, whose job it is to perform some computation and return a result to the client. The latency of this operation is typically dominated by three factors:

  1. IPC overhead: a simple IPC call is typically 10,000x the latency of a simple in-process method call.
  2. Server-side contention: the work done in the system service in response to the client's request may not start immediately, for instance if a server thread is busy handling other requests that arrived earlier.
  3. Server-side computation: handling the request in the server may itself require non-trivial computation.

You can eliminate all three of these latency factors by implementing a cache on the client side, provided that the cache is:

  • Correct: the client-side cache never returns results that differ from what the server would have returned.
  • Effective: client requests are often served from the cache, i.e. the cache has a high hit rate.
  • Efficient: the client-side cache makes efficient use of client-side resources, such as by representing cached data in a compact way and by not storing too many cached results or stale data in the client's memory.

Consider caching server results in the client

If clients often make the exact same request multiple times, and the value returned doesn't change over time, then you should implement a cache in the client library keyed by the request parameters.

Consider using IpcDataCache in your implementation:

public class BirthdayManager {
    private final IpcDataCache.QueryHandler<User, Birthday> mBirthdayQuery =
            new IpcDataCache.QueryHandler<User, Birthday>() {
                @Override
                public Birthday apply(User user) {
                    return mService.getBirthday(user);
                }
            };
    private static final int BDAY_CACHE_MAX = 8;  // Maximum birthdays to cache
    private static final String BDAY_API = "getUserBirthday";
    private final IpcDataCache<User, Birthday> mCache =
            new IpcDataCache<User, Birthday>(
                BDAY_CACHE_MAX, MODULE_SYSTEM, BDAY_API, BDAY_API, mBirthdayQuery);

    /** @hide **/
    @VisibleForTesting
    public static void clearCache() {
        IpcDataCache.invalidateCache(MODULE_SYSTEM, BDAY_API);
    }

    public Birthday getBirthday(User user) {
        return mCache.query(user);
    }
}

For a complete example, see android.app.admin.DevicePolicyManager.

IpcDataCache is available to all system code, including mainline modules. There is also PropertyInvalidatedCache which is nearly identical, but is only visible to the framework. Prefer IpcDataCache when possible.

Invalidate caches on server-side changes

If the value returned from the server can change over time, implement a change callback with the server, and register it so that you can invalidate the client-side cache whenever the server-side value changes.
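A plain-Java sketch of this pattern is below. The service interface, its methods, and the listener shape are hypothetical stand-ins for a Binder proxy, not a real Android API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch of a client-side cache that is invalidated by a server change
 * callback. BirthdayService is a hypothetical stand-in for the remote service.
 */
public class CachingClient {
    /** Hypothetical remote service interface. */
    public interface BirthdayService {
        String getBirthday(String user);
        void registerChangeListener(Runnable listener);
    }

    private final Map<String, String> mCache = new ConcurrentHashMap<>();
    private final BirthdayService mService;

    public CachingClient(BirthdayService service) {
        mService = service;
        // Drop all cached entries whenever the server reports a change.
        service.registerChangeListener(mCache::clear);
    }

    public String getBirthday(String user) {
        // Serve from the cache; fall through to the server call on a miss.
        return mCache.computeIfAbsent(user, mService::getBirthday);
    }
}
```

After the listener fires, the next query goes back to the server, so the cache never serves a value the server has since changed.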

Invalidate caches between unit test cases

In a unit test suite, you might test the client code against a test double rather than the real server. If so, then be sure to clear any client-side caches between test cases. This is to keep test cases mutually hermetic, and prevent one test case from interfering with another.

@RunWith(AndroidJUnit4.class)
public class BirthdayManagerTest {

    @Before
    public void setUp() {
        BirthdayManager.clearCache();
    }

    @After
    public void tearDown() {
        BirthdayManager.clearCache();
    }

    ...
}

When writing CTS tests that exercise an API whose client uses caching internally, the cache is an implementation detail that is not exposed to the API user. CTS tests should therefore not require any special knowledge of the caching used in the client code.

Study cache hits and misses

IpcDataCache and PropertyInvalidatedCache can print live statistics:

adb shell dumpsys cacheinfo
  ...
  Cache Name: cache_key.is_compat_change_enabled
    Property: cache_key.is_compat_change_enabled
    Hits: 1301458, Misses: 21387, Skips: 0, Clears: 39
    Skip-corked: 0, Skip-unset: 0, Skip-bypass: 0, Skip-other: 0
    Nonce: 0x856e911694198091, Invalidates: 72, CorkedInvalidates: 0
    Current Size: 1254, Max Size: 2048, HW Mark: 2049, Overflows: 310
    Enabled: true
  ...

Fields:

Hits:

  • Definition: The number of times a requested piece of data was successfully found within the cache.
  • Significance: Indicates an efficient and fast retrieval of data, reducing unnecessary data retrieval.
  • Higher counts are generally better.

Clears:

  • Definition: The number of times the cache was completely cleared.
  • Reasons for Clearing:
    • Invalidation: Outdated data from the server.
    • Space Management: Making room for new data when the cache is full.
  • High counts could indicate frequently changing data and potential inefficiency.

Misses:

  • Definition: The number of times the cache failed to provide the requested data.
  • Causes:
    • Inefficient caching: Cache too small or not storing the right data.
    • Frequently changing data.
    • First-time requests.
  • High counts suggest potential caching issues.

Skips:

  • Definition: Instances where the cache was not used at all, even though it could have been.
  • Reasons for Skipping:
    • “Corking”: Specific to Android Package Manager updates, deliberately turning off caching because of a high volume of calls during boot.
    • “Unset”: Cache exists but not initialized. The nonce was unset, which means the cache has never been invalidated.
    • “Bypass”: Intentional decision to skip the cache.
  • High counts indicate potential inefficiencies in cache usage.

Invalidates:

  • Definition: The number of times cached data was marked as outdated or stale.
  • Significance: Ensures the system works with the most up-to-date data, preventing errors and inconsistencies.
  • Typically triggered by the server that owns the data.

Current Size:

  • Definition: The current number of entries in the cache.
  • Significance: Indicates the cache's resource utilization and potential impact on system performance.
  • Higher values generally mean more memory is used by the cache.

Max Size:

  • Definition: The maximum number of entries the cache can hold.
  • Significance: Determines the cache's capacity and its ability to store data.
  • Setting an appropriate max size helps balance cache efficacy with memory usage. Once the maximum size is reached, adding a new element evicts the least recently used element; frequent evictions can indicate inefficiency.

High Water Mark:

  • Definition: The maximum size reached by the cache since its creation.
  • Significance: Provides insights into peak cache usage and potential memory pressure.
  • Monitoring the high water mark can help identify potential bottlenecks or areas for optimization.

Overflows:

  • Definition: The number of times the cache exceeded its max size and had to evict data to make room for new entries.
  • Significance: Indicates cache pressure and potential performance degradation due to data eviction.
  • High overflow counts suggest the cache size may need to be adjusted or the caching strategy reevaluated.

The same stats can also be found in a bugreport.
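The overall hit rate follows directly from the Hits and Misses counters. A minimal plain-Java sketch (the class and method names are illustrative):

```java
/** Derives the cache hit rate from the Hits and Misses counters in dumpsys cacheinfo. */
public class CacheStats {
    /** Fraction of lookups served from the cache; 0.0 if there were no lookups. */
    public static double hitRate(long hits, long misses) {
        long lookups = hits + misses;
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    public static void main(String[] args) {
        // Figures from the sample dump above.
        System.out.printf("hit rate: %.1f%%%n", 100 * hitRate(1301458, 21387));
    }
}
```

For the sample dump above, this works out to a hit rate of roughly 98.4%.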

Tune the size of the cache

Caches have a maximum size. When the maximum cache size is exceeded, entries are evicted in LRU order.

  • Caching too few entries could negatively affect the cache hit rate.
  • Caching too many entries increases the cache's memory usage.

Find the right balance for your use case.
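The LRU policy described above can be sketched in a few lines of plain Java using LinkedHashMap in access order. IpcDataCache manages its own eviction internally; this is only an illustration of the policy:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache sketch: the least recently used entry is evicted once maxSize is exceeded. */
public class SimpleLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int mMaxSize;

    public SimpleLruCache(int maxSize) {
        // accessOrder=true reorders entries on get(), giving LRU (not FIFO) eviction.
        super(16, 0.75f, true);
        mMaxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put(); returning true evicts the eldest entry.
        return size() > mMaxSize;
    }
}
```

On Android, android.util.LruCache offers the same policy with size accounting; the sketch above just makes the eviction mechanics visible.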

Eliminate redundant client calls

Clients may make the same query to the server multiple times in a short span:

public void executeAll(List<Operation> operations) throws SecurityException {
    for (Operation op : operations) {
        for (Permission permission : op.requiredPermissions()) {
            if (!permissionChecker.checkPermission(permission, ...)) {
                throw new SecurityException("Missing permission " + permission);
            }
        }
        op.execute();
    }
}

Consider reusing the results from previous calls:

public void executeAll(List<Operation> operations) throws SecurityException {
    Set<Permission> permissionsChecked = new HashSet<>();
    for (Operation op : operations) {
        for (Permission permission : op.requiredPermissions()) {
            // Set.add() returns true only the first time a permission is seen.
            if (permissionsChecked.add(permission)) {
                if (!permissionChecker.checkPermission(permission, ...)) {
                    throw new SecurityException(
                            "Missing permission " + permission);
                }
            }
        }
        op.execute();
    }
}

Example: Caching permission check results in ContentProvider#applyBatch

Consider client-side memoization of recent server responses

Client apps may query the API at a faster rate than the API's server can produce meaningfully new responses. In this case, an effective approach is to memoize the last seen server response at the client side along with a timestamp, and to return the memoized result without querying the server if the memoized result is recent enough. The API client author can determine the memoization duration.
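A minimal plain-Java sketch of this pattern, with an injected clock so it can be exercised off-device (all names here are illustrative, not an Android API):

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Sketch of client-side memoization of the last server response.
 * The clock is injected (e.g. SystemClock::elapsedRealtime on Android)
 * so the freshness logic is testable.
 */
public class MemoizingClient<T> {
    private final Supplier<T> mServerQuery;   // the expensive server call
    private final LongSupplier mClock;        // current time in milliseconds
    private final long mMaxAgeMillis;         // how long a response stays fresh

    private T mLastResponse;
    private long mLastQueryTimeMillis;

    public MemoizingClient(Supplier<T> serverQuery, LongSupplier clock, long maxAgeMillis) {
        mServerQuery = serverQuery;
        mClock = clock;
        mMaxAgeMillis = maxAgeMillis;
    }

    /** Returns the memoized response if it is recent enough, else queries the server. */
    public synchronized T query() {
        long now = mClock.getAsLong();
        if (mLastResponse == null || now - mLastQueryTimeMillis >= mMaxAgeMillis) {
            mLastResponse = mServerQuery.get();
            mLastQueryTimeMillis = now;
        }
        return mLastResponse;
    }
}
```

The memoization duration trades freshness for IPC volume; one second is a reasonable starting point for UI-driven polling, but the right value depends on what freshness the API documentation promises.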

For instance, an app may display network traffic statistics to the user by querying for the stats in every frame drawn:

@UiThread
private void setStats() {
    mobileRxBytesTextView.setText(
        Long.toString(TrafficStats.getMobileRxBytes()));
    mobileRxPacketsTextView.setText(
        Long.toString(TrafficStats.getMobileRxPackets()));
    mobileTxBytesTextView.setText(
        Long.toString(TrafficStats.getMobileTxBytes()));
    mobileTxPacketsTextView.setText(
        Long.toString(TrafficStats.getMobileTxPackets()));
}

The app may draw frames at 60Hz. But hypothetically, the client code in TrafficStats may choose to query the server for stats at most once per second, and if queried within a second of a previous query, return the last seen value. This is allowed since the API documentation doesn't make any guarantees about the freshness of the results returned.

participant App code as app
participant Client library as clib
participant Server as server

app->clib: request @ T=100ms
clib->server: request
server->clib: response 1
clib->app: response 1

app->clib: request @ T=200ms
clib->app: response 1

app->clib: request @ T=300ms
clib->app: response 1

app->clib: request @ T=2000ms
clib->server: request
server->clib: response 2
clib->app: response 2

Consider client-side codegen instead of server queries

If the query results are knowable to the server at build time, consider whether they are also knowable to the client at build time, and whether the API could be implemented entirely on the client side.

Consider the following app code that checks if the device is a watch (i.e., the device is running Wear OS):

public boolean isWatch(Context ctx) {
    PackageManager pm = ctx.getPackageManager();
    return pm.hasSystemFeature(PackageManager.FEATURE_WATCH);
}

This property of the device is known at build time, specifically at the time that the Framework was built for this device's boot image. The client-side code for hasSystemFeature could return a known result immediately, rather than querying the remote PackageManager system service.
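Purely as an illustration, the client-side short-circuit could look like the following, where DeviceFeatures is a hypothetical class that codegen could emit when the boot image is built (it is not a real Android class, and the actual mechanism is described in the linked doc):

```java
/**
 * Sketch of a build-time feature lookup. DeviceFeatures stands in for
 * constants a codegen step could bake into the boot image at build time.
 */
public class FeatureChecker {
    /** Hypothetical build-time-generated constants for this device image. */
    static final class DeviceFeatures {
        static final boolean FEATURE_WATCH = false;  // fixed when the image is built
    }

    /** Answers from the build-time constant, with no IPC to PackageManager. */
    public static boolean hasFeatureWatch() {
        return DeviceFeatures.FEATURE_WATCH;
    }
}
```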

See: Android Platform: Build-time Optimizations for System Features

Deduplicate server callbacks in the client

The API client may register callbacks with the API server to be notified of events.

It's typical for apps to register multiple callbacks for the same underlying information. Rather than have the server notify the client once per registered callback via IPC, the client library should have one registered callback via IPC with the server, and then notify each registered callback in the app.
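A plain-Java sketch of the fan-out pattern follows; FooService and the callback types are hypothetical stand-ins for the Binder interface:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/**
 * Sketch of callback deduplication: exactly one callback is registered with
 * the server, no matter how many app callbacks exist. Names are illustrative.
 */
public class FooManager {
    /** Hypothetical stand-in for the server's registration surface. */
    public interface FooService {
        void registerCallback(Consumer<String> callback);
    }

    private final FooService mService;
    private final List<Consumer<String>> mAppCallbacks = new CopyOnWriteArrayList<>();
    private boolean mRegisteredWithServer;

    public FooManager(FooService service) {
        mService = service;
    }

    public synchronized void registerCallback(Consumer<String> callback) {
        if (!mRegisteredWithServer) {
            // First app callback: register a single fan-out callback with the server.
            mService.registerCallback(this::dispatch);
            mRegisteredWithServer = true;
        }
        mAppCallbacks.add(callback);
    }

    private void dispatch(String event) {
        // One notification from the server reaches every app callback in-process.
        for (Consumer<String> callback : mAppCallbacks) {
            callback.accept(event);
        }
    }
}
```

A production version would also unregister from the server when the last app callback is removed, and marshal the dispatch onto the callers' executors.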

digraph d_front_back {
  rankdir=RL;
  node [style=filled, shape="rectangle", fontcolor="white" fontname="Roboto"]
  server->clib
  clib->c1;
  clib->c2;
  clib->c3;

  subgraph cluster_client {
    graph [style="dashed", label="Client app process"];
    c1 [label="my.app.FirstCallback" color="#4285F4"];
    c2 [label="my.app.SecondCallback" color="#4285F4"];
    c3 [label="my.app.ThirdCallback" color="#4285F4"];
    clib [label="android.app.FooManager" color="#F4B400"];
  }

  subgraph cluster_server {
    graph [style="dashed", label="Server process"];
    server [label="com.android.server.FooManagerService" color="#0F9D58"];
  }
}

Review binder telemetry

Data from the droidfood and beta populations can help you identify caching opportunities, or opportunities to improve the efficacy of existing caches through tuning.

Binder call counts and latencies

Look in Pitot Binder metrics for trace-based metrics on binder call counts and binder call latencies. Binder interfaces with a high volume of calls or a high latency per call are good candidates for caching, provided that they meet the other properties listed above.

You will find this page particularly helpful: distribution of binder calls count on battery

Additionally, you can use the go/trace-binder-spam dashboard to find examples of binder spam in field traces. Once you have found a spammed interface, you can open the associated traces to find relevant anecdotes to study in detail.

Pitot binder calls count dashboard

In the screenshot above we see a list of interfaces ordered by call count, highest first. You will notice several IPackageManager methods. Some, like setComponentEnabledSetting, are not candidates for caching because they are mutative. Others, like hasSystemFeature, are excellent candidates for caching, since they read a property with a well-defined lifetime.

Cache sizes

The Java heap classes dashboard can be used to find examples where a cache retains more memory than expected, suggesting tuning opportunities.

Java heap classes dashboard

In the screenshot above we see heap dumps with high amounts of memory retained by an inner class of PropertyInvalidatedCache. Click on a heap dump to see a dominator tree.

Heap dominator tree

In the screenshot above we see the dominator tree for a particular heap dump. Notice that you can see a large cache in ChangeIdStateCache, and another in PermissionManager. Are these sizes expected, or too large?

Recall that you can also inspect the cache hit rate at runtime using adb shell dumpsys cacheinfo. A large cache doesn't necessarily mean that the hit rate is high.