| Elastic Agent |
| ============== |
| |
| .. automodule:: torch.distributed.elastic.agent |
| .. currentmodule:: torch.distributed.elastic.agent |
| |
| Server |
| -------- |
| |
| .. automodule:: torch.distributed.elastic.agent.server |
| |
| Below is a diagram of an agent that manages a local group of workers. |
| |
| .. image:: agent_diagram.jpg |
| |
| Concepts |
| -------- |
| |
| This section describes the high-level classes and concepts that |
| are relevant to understanding the role of the ``agent`` in torchelastic. |
| |
| .. currentmodule:: torch.distributed.elastic.agent.server |
| |
| .. autoclass:: ElasticAgent |
| :members: |
| |
| .. autoclass:: WorkerSpec |
| :members: |
| |
| .. autoclass:: WorkerState |
| :members: |
| |
| .. autoclass:: Worker |
| :members: |
| |
| .. autoclass:: WorkerGroup |
| :members: |
| |
| Implementations |
| ------------------- |
| |
| Below are the agent implementations provided by torchelastic. |
| |
| .. currentmodule:: torch.distributed.elastic.agent.server.local_elastic_agent |
| .. autoclass:: LocalElasticAgent |
| |
| |
| Extending the Agent |
| --------------------- |
| |
| To extend the agent you can implement ```ElasticAgent`` directly, however |
| we recommend you extend ``SimpleElasticAgent`` instead, which provides |
| most of the scaffolding and leaves you with a few specific abstract methods |
| to implement. |
| |
| .. currentmodule:: torch.distributed.elastic.agent.server |
| .. autoclass:: SimpleElasticAgent |
| :members: |
| :private-members: |
| |
| .. autoclass:: torch.distributed.elastic.agent.server.api.RunResult |