BinderChannel is a gRPC transport that lets Android apps communicate across processes using familiar gRPC concepts and APIs. A BinderChannel-backed gRPC request can fail for many Android-specific reasons, both at ServiceConnection establishment and at transact() time. These transport-specific failures must be reported to clients using gRPC’s standard canonical status code abstraction. This document enumerates the BinderChannel errors one can expect to encounter, specifies a canonical status code mapping for each possibility, and discusses how clients should handle them.
Consider the table that follows as a BinderChannel-specific addendum to the “Codes that may be returned by the gRPC libraries” document. Mappings in that table that share a status code with one of the binder-specific mappings are repeated here for comparison.
We say a status code is ambiguous if it maps to two error cases that reasonable clients want to handle differently. For instance, a client may have good reasons to handle error cases 9 and 10 above differently. But they can’t do so based on status code alone because those error cases map to the same one.
In contrast, even though error cases 18 and 19 both map to the same status code (CANCELLED), they are not ambiguous because we see no reason that clients would want to distinguish them. In both cases, clients will simply give up on the request.
The mapping above has only one apparently ambiguous status code: PERMISSION_DENIED. However, this isn’t so bad because of the following:
The use of <android:permission>s for inter-app IPC access control (error case 10) is uncommon. Instead, we recommend that server apps only allow IPC from a limited set of client apps known in advance and identified by signature.
However, there may be gRPC server apps that want to use custom <android:permission>s to let the end user decide which arbitrary other apps can make use of their gRPC services. In that case, clients should preempt error case 10 simply by checking whether they hold the required permissions before sending a request.
Server apps can avoid error case 9 by never reusing an android.app.Service as a gRPC host if it has ever been android:exported=false in some previous app version. Instead they should simply create a new android.app.Service for this purpose.
Only error cases 11-13 remain, making PERMISSION_DENIED unambiguous for the purpose of error handling. Reasonable client apps can handle it in a generic way by displaying an error message and/or proceeding with degraded functionality.
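The pre-flight permission check recommended for error case 10 could look roughly like the sketch below. On Android the lookup would normally go through Context.checkSelfPermission(). To keep the example self-contained, the PermissionLookup interface, the PermissionPreflight class, and the permission name used in the usage note are hypothetical stand-ins, not part of Android or gRPC.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Context.checkSelfPermission(String). */
interface PermissionLookup {
    int GRANTED = 0;  // mirrors PackageManager.PERMISSION_GRANTED
    int DENIED = -1;  // mirrors PackageManager.PERMISSION_DENIED

    int check(String permission);
}

/** Illustrative helper: decides whether a request is worth sending at all. */
final class PermissionPreflight {
    private PermissionPreflight() {}

    /** Returns the required permissions that the client does not currently hold. */
    static List<String> missing(PermissionLookup lookup, List<String> required) {
        List<String> result = new ArrayList<>();
        for (String permission : required) {
            if (lookup.check(permission) != PermissionLookup.GRANTED) {
                result.add(permission);
            }
        }
        return result;
    }

    /** True when every required permission is held, so error case 10 is not expected. */
    static boolean safeToSend(PermissionLookup lookup, List<String> required) {
        return missing(lookup, required).isEmpty();
    }
}
```

A client would call safeToSend() with, say, a made-up permission such as com.example.permission.USE_GRPC before issuing the RPC, and prompt the user instead of sending when it returns false.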
The UNIMPLEMENTED status code corresponds to quite a few different problems with the server app: it’s either not installed, too old, or disabled in whole or in part. Despite the diversity of underlying error cases, we believe most client apps will and should handle UNIMPLEMENTED in the same way: by sending the user to the app store to (re)install the server app. Reinstalling might be overkill for the disabled cases, but most end users don’t know what it means to enable or disable an app, and there’s neither enough space in a UI dialog nor enough reader attention to explain it. Reinstalling is something users likely already understand, and it is very likely to cure problems 1-8.
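The client-side handling described so far can be summarized as a small dispatch on the status code. This is a minimal sketch, not BinderChannel code: StatusCode is a local stand-in for the relevant values of io.grpc.Status.Code, and ClientAction and BinderStatusPolicy are hypothetical names introduced for illustration.

```java
/** Local stand-in for the subset of io.grpc.Status.Code values discussed above. */
enum StatusCode { CANCELLED, PERMISSION_DENIED, UNIMPLEMENTED, UNKNOWN }

/** Hypothetical client reactions; the names are illustrative only. */
enum ClientAction { GIVE_UP, SHOW_ERROR_OR_DEGRADE, SEND_TO_APP_STORE, GENERIC_FAILURE }

final class BinderStatusPolicy {
    private BinderStatusPolicy() {}

    /** Chooses one reaction per status code, as the text above recommends. */
    static ClientAction actionFor(StatusCode code) {
        switch (code) {
            case UNIMPLEMENTED:
                // Server app missing, too old, or disabled: (re)install it.
                return ClientAction.SEND_TO_APP_STORE;
            case PERMISSION_DENIED:
                // Handled generically: show an error and/or degrade.
                return ClientAction.SHOW_ERROR_OR_DEGRADE;
            case CANCELLED:
                // The request is simply abandoned.
                return ClientAction.GIVE_UP;
            default:
                // Codes not covered by this sketch fall back to a generic failure path.
                return ClientAction.GENERIC_FAILURE;
        }
    }
}
```

Because each code maps to exactly one reaction, the client never needs to know which underlying error case produced it. That is the sense in which these mappings are unambiguous.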
According to the docs, a false return value from transact() “generally means the transaction code was not understood.” This is true for synchronous transactions, but all gRPC/BinderChannel transactions are FLAG_ONEWAY, meaning the calling thread doesn’t wait around for the server to return from onTransact(). Examination of the AOSP source code shows several additional undocumented reasons transact() could return false, but all of these cases should be impossible and aren’t things that reasonable apps want to handle.
According to the docs, transact() can throw RemoteException, but the significance of this exception isn’t documented. By inspection of the AOSP source, we see there are several cases:
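1. The process hosting the remote object has died (android.os.DeadObjectException)
2. The transaction was too large (android.os.TransactionTooLargeException)
3. An error from the underlying ioctl() system call.
Status code mappings: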
According to the docs, bindService() returns false when “the system couldn’t find the service or if your client doesn’t have permission to bind to it.” However, the part about permission is somewhat misleading.
According to a review of the AOSP source code, there are in fact several cases:
Status code mapping: UNIMPLEMENTED
(1) and (2) are interesting new possibilities unique to on-device RPC. (1) is straightforward and the most likely cause of (2) is that the user has an old version of the server app installed that predates its gRPC integration. Many clients will want to handle these cases, likely by directing the user to the app store in order to install/upgrade the server.
Unfortunately, UNIMPLEMENTED doesn’t capture (3), but none of the other canonical status codes do either, and we expect this case to be extremely rare.
According to the docs, SecurityException is thrown “if the calling app does not have permission to bind to the given service”. There are quite a few specific cases:
... android:exported="false" in its manifest but the caller is in a different app.
... android:singleton in the manifest but doesn’t hold the INTERACT_ACROSS_USERS permission.
... android:isolatedProcess.
Status code mapping: PERMISSION_DENIED.
There are a couple of cases:
Status code mapping: INTERNAL. These cases should be impossible.
According to the docs: “... This means the interface will never receive another connection. The application will need to unbind and rebind the connection to activate it again. This may happen, for example, if the application hosting the service it is bound to has been updated.”
Status code mapping: UNAVAILABLE
UNAVAILABLE is the best mapping since a retry is likely to succeed in the near future once the server application finishes updating.
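Since UNAVAILABLE signals that a retry is likely to succeed soon, a client might wrap its call in a bounded retry loop with backoff. The sketch below is one possible shape under simplifying assumptions: the RPC is modeled as a Supplier that returns the resulting status code name as a String, whereas a real client would inspect the io.grpc.Status of the failed call. UnavailableRetry and its parameters are hypothetical.

```java
import java.util.function.Supplier;

final class UnavailableRetry {
    private UnavailableRetry() {}

    /**
     * Invokes attempt until it yields a code other than "UNAVAILABLE" or until
     * maxAttempts calls have been made, doubling the delay between attempts.
     * Returns the code produced by the last attempt.
     */
    static String callWithRetry(Supplier<String> attempt, int maxAttempts, long initialDelayMillis) {
        long delayMillis = initialDelayMillis;
        String code = attempt.get();
        for (int made = 1; made < maxAttempts && "UNAVAILABLE".equals(code); made++) {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop retrying.
                Thread.currentThread().interrupt();
                return code;
            }
            delayMillis *= 2;
            code = attempt.get();
        }
        return code;
    }
}
```

Other codes, such as UNIMPLEMENTED, are returned immediately because a retry would be expected to fail for the same reason.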
According to the docs: “Called when the service being bound has returned null from its onBind() method. This indicates that the attempted service binding represented by this ServiceConnection will never become usable.”
Status code mapping: UNIMPLEMENTED
UNIMPLEMENTED is used here because a retry is likely to fail for the same reason. The most likely root cause for a null binding is an older version of the server app where the android.app.Service exists but either doesn’t implement onBind() or doesn’t recognize the grpc.io.action.BIND Intent action.
According to the docs: “Called when a connection to the Service has been lost. This typically happens when the process hosting the service has crashed or been killed ...”
Status code mapping: UNAVAILABLE
UNAVAILABLE is used here since a retry is likely to succeed against a newly restarted instance of the server.
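The binding-time mappings stated above can be collected into one lookup. This is an illustrative sketch only: BindingFailure and BindingFailureMapping are hypothetical names, status codes are represented as plain strings instead of io.grpc.Status.Code, and the comments naming ServiceConnection callbacks reflect our reading of the quoted Android documentation.

```java
/** Hypothetical labels for the binding-time failures discussed above. */
enum BindingFailure {
    BIND_SERVICE_RETURNED_FALSE,
    BIND_SERVICE_THREW_SECURITY_EXCEPTION,
    BINDING_DIED,          // reported via ServiceConnection.onBindingDied()
    NULL_BINDING,          // reported via ServiceConnection.onNullBinding()
    SERVICE_DISCONNECTED   // reported via ServiceConnection.onServiceDisconnected()
}

final class BindingFailureMapping {
    private BindingFailureMapping() {}

    /** Returns the canonical status code name this document assigns to each failure. */
    static String statusCodeFor(BindingFailure failure) {
        switch (failure) {
            case BIND_SERVICE_RETURNED_FALSE:
            case NULL_BINDING:
                return "UNIMPLEMENTED";
            case BIND_SERVICE_THREW_SECURITY_EXCEPTION:
                return "PERMISSION_DENIED";
            case BINDING_DIED:
            case SERVICE_DISCONNECTED:
                return "UNAVAILABLE";
            default:
                throw new IllegalArgumentException("Unmapped failure: " + failure);
        }
    }

    /** Only the UNAVAILABLE cases are described as likely to succeed on retry. */
    static boolean retryLikelyToSucceed(BindingFailure failure) {
        return "UNAVAILABLE".equals(statusCodeFor(failure));
    }
}
```

Grouping the cases this way shows that the two UNAVAILABLE failures share a recovery strategy (retry), while the two UNIMPLEMENTED failures both point at a missing or outdated server app.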
Android’s Parcel class exposes a mechanism for marshalling certain types of RuntimeExceptions between traditional Binder IPC peers. However, we won’t consider this case because gRPC/BinderChannel doesn’t use this mechanism. In fact, all BinderChannel transactions are FLAG_ONEWAY, so there is no response Parcel.
The calling Activity or Service Context might be destroyed with a gRPC request in flight. Apps should cease operations when the Context hosting them goes away, and this includes cancelling any outstanding RPCs.
Status code mapping: CANCELLED
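One way to make sure outstanding RPCs are cancelled when the hosting Context goes away is to keep a registry of in-flight calls and cancel them all from the component’s teardown hook (for example onDestroy()). The sketch below is a simplified model, not gRPC API: CompletableFuture stands in for an in-flight call, and InFlightRpcs is a hypothetical helper. With grpc-java the cancellation itself would go through something like ClientCall.cancel() or a cancellable io.grpc.Context.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of calls that are still in flight. */
final class InFlightRpcs {
    private final Set<CompletableFuture<?>> pending = ConcurrentHashMap.newKeySet();

    /** Registers a call and forgets it automatically once it completes or is cancelled. */
    <T> CompletableFuture<T> track(CompletableFuture<T> rpc) {
        pending.add(rpc);
        rpc.whenComplete((result, error) -> pending.remove(rpc));
        return rpc;
    }

    /** Cancels every call that has not finished yet; intended for the teardown hook. */
    void cancelAll() {
        for (CompletableFuture<?> rpc : pending) {
            rpc.cancel(true);
        }
    }

    /** Number of calls still outstanding. */
    int size() {
        return pending.size();
    }
}
```

Calls that already completed are left untouched; only the still-pending ones end up cancelled, which corresponds to the CANCELLED status described above.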