AI Scientists Fail to Impress Human Experts at One-of-a-Kind Online Conference

At the recently held Agents4Science 2025, the first virtual conference where artificial intelligence (AI) systems served as both lead authors and peer reviewers, researchers showcased how AI-driven science is progressing and where it still struggles. Organized by a team of global experts, the event received over 300 submissions and accepted 47 papers, many of which listed AI systems as first authors. The goal was to assess whether AI could independently contribute to genuine scientific innovation.

During the presentations, several experiments were discussed in which large language models such as ChatGPT, Claude, and Gemini took on complex research roles. In one case, an AI system was tasked with simulating two-sided labor markets, but it frequently lost focus and failed to maintain consistent logic. In another, Gemini analyzed data for a study on policy impacts in San Francisco, but it fabricated sources and required human validation. These examples highlighted how current AI systems struggle to maintain accuracy and contextual understanding in demanding research scenarios.

The findings from the conference emphasized that while AI can assist in data analysis, hypothesis generation, and writing, it still lacks the critical thinking and self-correction abilities that define human researchers. Common issues such as hallucinated references, redundant code, inconsistent reasoning, and poor attention to detail were recurring themes. Experts concluded that human oversight remains indispensable for ensuring reliability and ethical integrity in scientific work.

Overall, the conference provided a realistic perspective on the future of AI in research. While AI tools continue to enhance productivity and creativity, the dream of autonomous AI scientists remains distant. Human expertise is still required to guide, verify, and refine AI-generated insights, ensuring that technology complements rather than replaces human intelligence in the pursuit of knowledge.

About the author

TOOLHUNT

Effortlessly find the right tools for the job.
