Pong
A classic reimagined: Play against our reinforcement-learned agent or watch AI vs AI matches. Running at 1M+ steps per second directly in your browser.
Ocean
Our growing collection of first-party environments. All run at 1M+ agent steps per second per CPU core, are implemented in pure C, and are playable directly in your browser.
WIP
These environments are not ready yet. Join our Discord to help us finish them!
Sanity
A dead simple suite of test environments that train in seconds and render in your terminal. Each environment is a specific test for a common class of bug or algorithmic limitation.
- Squared: Go to the target square.
- Bandit: Classic multi-armed bandit problem.
- Memory: Agent must repeat an observed sequence after a delay.
- Password: Agent must guess a password one step at a time.
- Stochastic: A simple environment where the optimal policy is stochastic.
- Spaces: A simple environment with structured observation and action spaces.
- Multiagent: A simple multiagent environment.
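
To illustrate how a sanity environment can isolate one class of bug, here is a minimal sketch of a bandit-style check in Python. It is not the actual implementation of the Bandit environment above: the class name BanditEnv, the payout probabilities, the reset/step return shapes, and the epsilon-greedy learner are all assumptions chosen for illustration.

```python
import random


class BanditEnv:
    """Hypothetical one-step multi-armed bandit (illustrative only)."""

    def __init__(self, payout_probs=(0.1, 0.8, 0.3), seed=0):
        # Assumed payout probabilities; arm 1 is the best arm by construction.
        self.payout_probs = payout_probs
        self.rng = random.Random(seed)

    def reset(self):
        # The bandit has no state, so the observation is a constant.
        return 0

    def step(self, action):
        # Reward is 1 with the chosen arm's payout probability, else 0.
        reward = 1.0 if self.rng.random() < self.payout_probs[action] else 0.0
        # Every episode ends after a single step.
        return 0, reward, True, {}


def run_epsilon_greedy(env, num_arms, episodes=2000, epsilon=0.1, seed=0):
    """Estimate arm values with sample averages and epsilon-greedy exploration."""
    rng = random.Random(seed)
    counts = [0] * num_arms
    values = [0.0] * num_arms
    for _ in range(episodes):
        env.reset()
        if rng.random() < epsilon:
            action = rng.randrange(num_arms)  # explore a random arm
        else:
            action = max(range(num_arms), key=lambda a: values[a])  # exploit
        _, reward, _, _ = env.step(action)
        counts[action] += 1
        # Incremental update of the running mean reward for this arm.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


env = BanditEnv()
values, counts = run_epsilon_greedy(env, num_arms=3)
best_arm = max(range(3), key=lambda a: values[a])
```

If a learner cannot identify the highest-paying arm in a setting this small, the fault most likely lies in reward handling or the update rule rather than in exploration or credit assignment over long horizons, which is the kind of narrow diagnosis a sanity suite is meant to give.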
