PufferLib is a fast and sane reinforcement learning library that can train tiny, superhuman models in seconds. The included learning algorithm, hyperparameter tuning, and simulation methods are the product of our own research. All our tools are free and open source. Need a high-performance environment for your application? We build them professionally and offer training and extended support. Contact jsuarez🐡puffer🐡ai.

The demo below runs live, 100% client-side, in your browser. Hold shift to take control!

Pong

A classic reimagined: play against our reinforcement-learned agent or watch AI vs. AI matches. Runs at 1M+ steps per second directly in your browser.

BibTeX

Repo / RLC Best Paper 2025 / Original Whitepaper
@misc{pufferlib,
    title        = {{PufferLib}: Fast and Sane Simplifying Reinforcement Learning for Complex Game Environments},
    author       = {Joseph Suarez},
    howpublished = {\url{https://github.com/PufferAI/PufferLib}},
    year         = {2024},
    note         = {GitHub repository}
}

@article{suarez2025pufferlib,
  title   = {{PufferLib} 2.0: {R}einforcement Learning at 1M steps/s},
  author  = {Suarez, Joseph},
  journal = {Reinforcement Learning Journal},
  volume  = {6},
  pages   = {1378--1388},
  year    = {2025}
}

@misc{suarez2024pufferlibmakingreinforcementlearning,
    title={PufferLib: Making Reinforcement Learning
        Libraries and Environments Play Nice},
    author={Joseph Suarez},
    year={2024},
    eprint={2406.12905},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2406.12905},
}

Contributors

Joseph Suarez: Founder & Head Puffer. Writes a lot of code.
Spencer Cheng: Go, Connect4, TripleTriad, RWare, Tower Climb
Kyoung Whan Choe (최경환): Mujoco bindings, Testing and bug fixes.
Peru: 4.0 Environment vectorization, kernel dev
Jonah: kernel dev
David Rubinstein: Several performance improvements w/ torch compilation, lead pokerl contributor.
Andrew LeFevre: Impulse Wars
Nathan Lichtlé: Tactics
Daniel Addis: Enduro, testing, bug fixes, outreach, recruitment; major pokerl contributor.
Hadrien Crassous: Tetris, freeway
Finlay Sanders: Drone
Sam Turner: Drone
Kinvert: Whisker race
Keelan Donovan: Major pokerl contributor.
Gabe: Pacman
Joao Abrantes: Slimevolley
Yannik: 2048
Xander: Trash Pickup
Noah Farr: Breakout
Jake Forsey: Connect4 rewrite with fast minmax AI opponent
David (dmoore101): Improved breakout physics
haterade: Website design for demo page
arb8020: Website improvements for environment demos
David Bloomin: CARBS integration improvements, 0.4 policy pool/store/selector
Black Ink South: Character art for MOBA
Nick Jenkins: Layout for the system architecture diagram. Adversary.design
Andranik Tigranyan: Streamline and animate the pufferfish. Hire him on UpWork if you like what you see here.
Sara Earle: Original pufferfish model. Hire her on UpWork if you like what you see here.