Thursday, May 21, 2026

Andrej Karpathy Unveils Autoresearch, an Open-Source Framework for Autonomous ML Experimentation

Andrej Karpathy has released Autoresearch, an open-source framework that turns ML experimentation into an autonomous loop. The 630-line Python tool lets AI agents read their own source code, form hypotheses for improvement, modify the code, run experiments, and commit successful changes via Git, all without human intervention.

The framework, published in early March 2026, has already completed 50 experiments overnight on a single GPU. In Karpathy’s accompanying visualisation, each dot represents an independent experiment: the agent proposes a change, such as adjusting a learning rate or modifying architecture depth, runs the training loop, evaluates the result, and decides whether to keep or discard the modification.


One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of “group meeting”. That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that’s right or wrong as the “code” is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026.

How does Autoresearch work?

The design is deliberately minimal. Autoresearch reads its own codebase, generates a hypothesis for improvement, implements the change, runs the experiment, and evaluates the result against a metric. Successful experiments are committed to Git. Failed ones are discarded. The loop continues autonomously until the allotted GPU time runs out or the user stops it.
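The cycle described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Autoresearch's actual code: the names `run_loop`, `LoopState`, `propose_lr_change` and `toy_loss` are hypothetical, the "experiment" is a toy function rather than a real training run, and an in-memory history list stands in for Git commits. It assumes a metric where lower is better.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Config = Dict[str, float]


@dataclass
class LoopState:
    best_config: Config
    best_score: float
    history: List[str] = field(default_factory=list)  # stand-in for a Git log


def run_loop(
    baseline: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    n_experiments: int,
    seed: int = 0,
) -> LoopState:
    """Propose a change, evaluate it, and keep it only if the metric improves."""
    rng = random.Random(seed)
    state = LoopState(best_config=dict(baseline), best_score=evaluate(baseline))
    for i in range(n_experiments):
        candidate = propose(state.best_config, rng)
        score = evaluate(candidate)
        if score < state.best_score:
            # "Commit": the candidate becomes the new baseline.
            state.best_config, state.best_score = candidate, score
            state.history.append(f"experiment {i}: kept, score={score:.4f}")
        # Otherwise "discard": the baseline is left unchanged.
    return state


def propose_lr_change(config: Config, rng: random.Random) -> Config:
    # Hypothetical proposal step: halve or double the learning rate.
    new = dict(config)
    new["learning_rate"] = config["learning_rate"] * rng.choice([0.5, 2.0])
    return new


def toy_loss(config: Config) -> float:
    # Hypothetical stand-in for a training run: best at learning_rate = 0.01.
    return abs(config["learning_rate"] - 0.01)


if __name__ == "__main__":
    result = run_loop({"learning_rate": 0.16}, propose_lr_change, toy_loss, n_experiments=50)
    print(result.best_config, result.best_score)
    print(len(result.history), "experiments kept")
```

In a real setup the proposal step would be an AI agent editing code and the evaluation step a full training run. The keep-or-discard decision is the only part this sketch models faithfully.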

What makes the framework interesting is its simplicity. At 630 lines of Python, it strips the agentic experimentation concept down to its essential loop. There is no elaborate orchestration and no dependency on a complex agent framework, just a tight read-modify-evaluate-commit cycle that a single GPU can sustain overnight.

Beyond ML research

The design pattern behind Autoresearch applies well beyond machine learning. Any domain where code can be modified, evaluated against a metric, and iterated on autonomously fits the model. Software testing, optimisation problems, and configuration tuning are all candidates for the same approach.
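To illustrate how little of the pattern is specific to machine learning, here is a minimal sketch applied to configuration tuning. Everything in it is an assumption made for illustration: `improve`, `nudge_workers` and `simulated_throughput` are hypothetical names, and the throughput curve is made up rather than measured. Only the mutation and the metric are domain-specific; the keep-if-better loop is generic.

```python
import random
from typing import Callable, TypeVar

T = TypeVar("T")


def improve(
    start: T,
    mutate: Callable[[T, random.Random], T],
    metric: Callable[[T], float],
    steps: int,
    seed: int = 0,
) -> T:
    """Keep a mutated candidate only when it scores strictly higher than the best so far."""
    rng = random.Random(seed)
    best, best_score = start, metric(start)
    for _ in range(steps):
        candidate = mutate(best, rng)
        score = metric(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best


def simulated_throughput(workers: int) -> float:
    # Made-up curve: throughput rises with worker count, then contention dominates.
    return workers * 100.0 - (workers ** 2) * 5.0


def nudge_workers(workers: int, rng: random.Random) -> int:
    # Move one step up or down, never below a single worker.
    return max(1, workers + rng.choice([-1, 1]))


if __name__ == "__main__":
    print(improve(2, nudge_workers, simulated_throughput, steps=200))
```

Swapping in a different `mutate` and `metric` pair, such as a test-suite pass rate or a benchmark timing, would reuse the same loop unchanged.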

Karpathy’s framework also highlights a broader trend: AI agents that improve themselves. The loop of proposing changes, evaluating them, and committing winners is a primitive form of self-improvement. As agent frameworks grow more capable, the question is whether this pattern scales from single-GPU ML experiments to larger, more consequential domains.

