| Load Balancing in gRPC |
| ====================== |
| |
| # Scope |
| |
| This document explains the design for load balancing within gRPC. |
| |
| # Background |
| |
| ## Per-Call Load Balancing |
| |
| It is worth noting that load-balancing within gRPC happens on a per-call |
| basis, not a per-connection basis. In other words, even if all requests |
| come from a single client, we still want them to be load-balanced across |
| all servers. |
| |
| ## Approaches to Load Balancing |
| |
| Prior to any gRPC specifics, we explore some usual ways to approach load |
| balancing. |
| |
| ### Proxy Model |
| |
| Using a proxy provides a solid trustable client that can report load to the load |
| balancing system. Proxies typically require more resources to operate since they |
| have temporary copies of the RPC request and response. This model also increases |
| latency to the RPCs. |
| |
| The proxy model was deemed inefficient when considering request heavy services |
| like storage. |
| |
| ### Balancing-aware Client |
| |
| This thicker client places more of the load balancing logic in the client. For |
| example, the client could contain many load balancing policies (Round Robin, |
| Random, etc) used to select servers from a list. In this model, a list of |
| servers would be either statically configured in the client, provided by the |
| name resolution system, an external load balancer, etc. In any case, the client |
| is responsible for choosing the preferred server from the list. |
| |
| One of the drawbacks of this approach is writing and maintaining the load |
| balancing policies in multiple languages and/or versions of the clients. These |
| policies can be fairly complicated. Some of the algorithms also require client |
| to server communication so the client would need to get thicker to support |
| additional RPCs to get health or load information in addition to sending RPCs |
| for user requests. |
| |
| It would also significantly complicate the client's code: the new design hides |
| the load balancing complexity of multiple layers and presents it as a simple |
| list of servers to the client. |
| |
| ### External Load Balancing Service |
| |
| The client load balancing code is kept simple and portable, implementing |
| well-known algorithms (e.g., Round Robin) for server selection. |
| Complex load balancing algorithms are instead provided by the load |
| balancer. The client relies on the load balancer to provide _load |
| balancing configuration_ and _the list of servers_ to which the client |
| should send requests. The balancer updates the server list as needed |
| to balance the load as well as handle server unavailability or health |
| issues. The load balancer will make any necessary complex decisions and |
| inform the client. The load balancer may communicate with the backend |
| servers to collect load and health information. |
| |
| # Requirements |
| |
| ## Simple API and client |
| |
| The gRPC client load balancing code must be simple and portable. The |
| client should only contain simple algorithms (e.g., Round Robin) for |
| server selection. For complex algorithms, the client should rely on |
| a load balancer to provide load balancing configuration and the list of |
| servers to which the client should send requests. The balancer will update |
| the server list as needed to balance the load as well as handle server |
| unavailability or health issues. The load balancer will make any necessary |
| complex decisions and inform the client. The load balancer may communicate |
| with the backend servers to collect load and health information. |
| |
| ## Security |
| |
| The load balancer may be separate from the actual server backends and a |
| compromise of the load balancer should only lead to a compromise of the |
| loadbalancing functionality. In other words, a compromised load balancer should |
| not be able to cause a client to trust a (potentially malicious) backend server |
| any more than in a comparable situation without loadbalancing. |
| |
| # Architecture |
| |
| ## Overview |
| |
| The primary mechanism for load-balancing in gRPC is external |
| load-balancing, where an external load balancer provides simple clients |
| with an up-to-date list of servers. |
| |
| The gRPC client does support an API for built-in load balancing policies. |
| However, there are only a small number of these (one of which is the |
| `grpclb` policy, which implements external load balancing), and users |
| are discouraged from trying to extend gRPC by adding more. Instead, new |
| load balancing policies should be implemented in external load balancers. |
| |
| ## Workflow |
| |
| Load-balancing policies fit into the gRPC client workflow in between |
| name resolution and the connection to the server. Here's how it all |
| works: |
| |
|  |
| |
| 1. On startup, the gRPC client issues a [name resolution](naming.md) request |
| for the server name. The name will resolve to one or more IP addresses, |
| each of which will indicate whether it is a server address or |
| a load balancer address, and a [service config](service_config.md) |
| that indicates which client-side load-balancing policy to use (e.g., |
| `round_robin` or `grpclb`). |
| 2. The client instantiates the load balancing policy. |
| - Note: If any one of the addresses returned by the resolver is a balancer |
| address, then the client will use the `grpclb` policy, regardless |
| of what load-balancing policy was requested by the service config. |
| Otherwise, the client will use the load-balancing policy requested |
| by the service config. If no load-balancing policy is requested |
| by the service config, then the client will default to a policy |
| that picks the first available server address. |
| 3. The load balancing policy creates a subchannel to each server address. |
| - For all policies *except* `grpclb`, this means one subchannel for each |
| address returned by the resolver. Note that these policies |
| ignore any balancer addresses returned by the resolver. |
| - In the case of the `grpclb` policy, the workflow is as follows: |
| 1. The policy opens a stream to one of the balancer addresses returned |
| by the resolver. It asks the balancer for the server addresses to |
| use for the server name originally requested by the client (i.e., |
| the same one originally passed to the name resolver). |
| - Note: In the `grpclb` policy, the non-balancer addresses returned |
| by the resolver are used as a fallback in case no balancers can be |
| contacted when the LB policy is started. |
| 2. The gRPC servers to which the load balancer is directing the client |
| may report load to the load balancers, if that information is needed |
| by the load balancer's configuration. |
| 3. The load balancer returns a server list to the gRPC client's `grpclb` |
| policy. The `grpclb` policy will then create a subchannel to each of |
| server in the list. |
| 4. For each RPC sent, the load balancing policy decides which |
| subchannel (i.e., which server) the RPC should be sent to. |
| - In the case of the `grpclb` policy, the client will send requests |
| to the servers in the order in which they were returned by the load |
| balancer. If the server list is empty, the call will block until a |
| non-empty one is received. |