Torch Distributed Elastic

Created On: Jun 16, 2025 | Last Updated On: Jul 25, 2025

Makes distributed PyTorch fault-tolerant and elastic.

Get Started

Usage:
  - Quickstart
  - Train script
  - Examples

Documentation

API:
  - torchrun (Elastic Launch)
  - Elastic Agent
  - Multiprocessing
  - Error Propagation
  - Rendezvous
  - Expiration Timers
  - Metrics
  - Events
  - Subprocess Handling
  - Control Plane
  - NUMA Binding Utilities

Advanced:
  - Customization

Plugins:
  - TorchElastic Kubernetes