Ray is a high-performance distributed execution framework targeted at large-scale machine learning and reinforcement learning applications. It achieves scalability and fault tolerance by abstracting the control state of the system in a global control store and keeping all other components stateless. It uses a shared-memory distributed object store to handle large data efficiently, and a bottom-up hierarchical scheduling architecture to achieve low-latency, high-throughput scheduling. Its lightweight API, based on dynamic task graphs and actors, can express a wide range of applications flexibly.
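To make the "dynamic task graphs and actors" idea concrete, here is a minimal sketch that uses only Python's standard library. It is not Ray's API: the names `submit`, `ActorHandle`, `Counter`, and `demo` are hypothetical and exist only for illustration, and a local thread pool stands in for a distributed scheduler. The sketch shows two things: tasks return futures that can be passed to later tasks (so the graph is built as the program runs), and an actor is a stateful object whose method calls are executed one at a time.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical stand-in for a scheduler: a local thread pool, not a cluster.
_pool = ThreadPoolExecutor(max_workers=4)


def _resolve(args):
    """Replace any Future argument with its result, waiting if necessary."""
    return [a.result() if isinstance(a, Future) else a for a in args]


def submit(fn, *args):
    """Schedule fn as a task and return a Future for its result.

    Futures passed as arguments become edges in the task graph: the task
    waits for them before calling fn.
    """
    return _pool.submit(lambda: fn(*_resolve(args)))


class ActorHandle:
    """Wraps a stateful object and runs its methods on one dedicated thread,
    so calls are applied to the state in submission order."""

    def __init__(self, cls, *args):
        self._instance = cls(*args)
        self._executor = ThreadPoolExecutor(max_workers=1)

    def call(self, method, *args):
        bound = getattr(self._instance, method)
        return self._executor.submit(lambda: bound(*_resolve(args)))


class Counter:
    """Example of actor state: a running total."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


def square(x):
    return x * x


def add(x, y):
    return x + y


def demo():
    # Two independent tasks, then a third that depends on both.
    a = submit(square, 3)
    b = submit(square, 4)
    total = submit(add, a, b)

    # The actor keeps state across calls; a task's future can be an argument.
    counter = ActorHandle(Counter)
    counter.call("add", total)
    final = counter.call("add", 5)
    return total.result(), final.result()


if __name__ == "__main__":
    print(demo())  # (25, 30)
```

In this sketch a dependent task blocks a worker thread while it waits for its inputs, which is acceptable for a toy example but is exactly the kind of cost a real scheduler is designed to avoid.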
Check out the following links!
Codebase: https://github.com/ray-project/ray
Documentation: http://ray.readthedocs.io/en/latest/index.html
Tutorial: https://github.com/ray-project/tutorial
Blog: https://ray-project.github.io
Mailing list: [email protected]