Artificial Intelligence is creating the first generation of virtual scientists
A fully automated AI researcher has produced a paper that meets scientific standards.
Scientific research can be a long and tedious process, and a frustrating one: funding is limited, and so is time. One solution to make science more efficient? Have it done by a robot researcher, an AI colleague. At least in theory.
The First Automated AI Researcher
In 2024, the Tokyo-based startup Sakana.ai unveiled “The AI Scientist,” an artificial intelligence system capable of creating new machine learning research from scratch, completely autonomously, for just $15 per article.
The model can go through the entire research process without any human assistance: from creating new hypotheses to running the code and writing up the results.
And it goes even further: it has its own peer-review system that automatically assesses the quality of the article, ensuring it meets scientific standards.
When an independent team of researchers tested the 2024 version of the system, they found the quality of its results to be quite low. Although it was indeed able to carry out the entire research cycle on its own, the result was—as the authors say—like that of “a demotivated college student rushing to meet a deadline.”
The most worrying issues that emerged from the experiment were incomplete sections of the work, outdated or limited references, and incorrect—or even fabricated—numerical results, often referred to as “hallucinations” in AI terms.
Even so, the researchers saw potential in the system, especially for its efficiency. They calculated that what would have taken that “demotivated college student” at least 20 hours, the AI colleague was able to do in just 3.5. And all this for an average cost of between $6 and $15.
An AI-generated paper accepted at a conference workshop
A year and a half later, Sakana.ai tested the latest version of its system: three AI-generated papers, along with 40 created by humans, were submitted for peer review at a workshop during a top-tier conference on machine learning. The reviewers knew that some of the papers had been generated by AI, but they didn't know which ones. Around 70% of the submitted papers passed the first round. Two of the AI-generated papers didn't make the cut, but one did, meaning it met the required scientific standards.

The latest version of the “AI scientist” still has its flaws, such as underdeveloped ideas, structural problems, and many types of hallucinations. Even so, as its own evaluation system shows, the quality of the papers seems to be steadily increasing over time, meaning a future with virtual scientists doesn't seem so far off.

AI could solve the problem of inefficiency in science. AI systems are tireless: they can read research articles in seconds, they don't complain about overtime, and they don't need to be paid, or at least they cost far less than a human researcher. This could mean more results in less time, and with that, a much more efficient process of scientific discovery.

However, the question is where those discoveries will lead us. When a human conducts research, the final product is the result of dozens, if not hundreds, of small decisions. No two researchers would ever approach a research topic in exactly the same way. When made by AI, these decisions disappear behind what is considered a “superhuman” system, one that could be seen as smarter, faster, and more objective than we are.

The Risks of Virtual Researchers
So, what would happen if we relied on AI instead of our own diverse minds? Anthropologist Lisa Messeri and neuroscientist Molly Crockett foresee what they call a “monoculture of science.” In agriculture, monoculture is the practice of growing only one type of crop, instead of several, over a period of time.
This usually means higher profits, but it also increases the risk of crops falling victim to pests and diseases. Something similar could happen if we let AI systems do science for us. AI might initiate only the kind of research best suited to it, at the expense of projects that require more context and nuance: the “human touch.” This could not only narrow the scope of science but also introduce systematic errors, errors that become much harder to detect once humans step out of the picture.
“The biggest risk is relying too heavily on AI-generated results. The key countermeasure is the human capacity for critical thinking,” explained Iryna Gurevych, Professor of Ubiquitous Knowledge Processing, to Science Media Center Germany.
Without critical thinking, we may objectively produce more, but end up understanding less.

