
AI Outperforms Humans in Predicting Neuroscience Study Outcomes, Reveals UCL Research

A groundbreaking study led by University College London (UCL) has found that large language models (LLMs), a type of artificial intelligence (AI) that analyses text, are more adept at predicting the results of proposed neuroscience research than human professionals are.

Published in Nature Human Behaviour, the study underscores the capabilities of LLMs, which are trained on vast text databases. By identifying patterns across scientific papers, these models can anticipate the outcomes of experiments with striking accuracy. The researchers believe this discovery could significantly accelerate research progress, extending far beyond the current use of AI for information retrieval.

Dr Ken Luo, the study’s lead author and a member of the UCL Psychology & Language Sciences department, explained that while LLMs are known for their ability to extract information from massive volumes of data, their potential to predict future scientific outcomes has been less explored. He noted that research progress often involves trial and error, with even highly skilled researchers occasionally missing valuable insights. This study investigates whether LLMs could facilitate this process by gleaning patterns from vast scientific texts to predict experimental outcomes.

To test this, the research team created BrainBench, a benchmark that measures how well LLMs can forecast neuroscience results. It comprises numerous pairs of neuroscience research abstracts: in each pair, one abstract is real and the other has been altered to report plausible but incorrect results. The team tested 15 general-purpose LLMs and 171 neuroscience experts, asking each to identify which abstract contained the real results.
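The BrainBench setup can be illustrated with a toy sketch. The snippet below uses a simple add-one-smoothed unigram language model as a stand-in (an assumption for illustration; the study used transformer LLMs) to score two versions of an abstract by perplexity and pick the one that better matches the "literature" it was trained on. All texts, names, and the scoring function here are illustrative, not taken from the study.

```python
import math
from collections import Counter

def unigram_perplexity(text, counts, total, vocab_size):
    """Per-word perplexity under an add-one-smoothed unigram model."""
    words = text.lower().split()
    log_prob = 0.0
    for w in words:
        p = (counts.get(w, 0) + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(words))

# Toy "training" corpus standing in for the neuroscience literature.
corpus = (
    "stimulation of the hippocampus increased memory recall in rats "
    "hippocampal lesions impaired spatial memory recall"
).split()
counts = Counter(corpus)
total = len(corpus)
vocab_size = len(counts) + 1  # reserve one slot for unseen words

# A BrainBench-style pair: one real abstract, one altered result.
real = "hippocampal stimulation increased memory recall"
altered = "hippocampal stimulation decreased memory recall"

ppl_real = unigram_perplexity(real, counts, total, vocab_size)
ppl_altered = unigram_perplexity(altered, counts, total, vocab_size)

# Lower perplexity = the text looks more like the training literature.
choice = "real" if ppl_real < ppl_altered else "altered"
margin = abs(ppl_real - ppl_altered)  # a crude confidence signal
```

The gap between the two perplexity scores acts as a rough confidence signal, echoing the study's finding that the more confident a model was, the more likely it was to be correct.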

In this contest, all the LLMs outperformed the human experts, scoring an average accuracy of 81% compared to the humans’ 63%. Even when only the most experienced humans were considered, their accuracy still fell short at 66%. The study also found that the more confident the AI was, the more likely it was to be correct.

The researchers then trained an existing LLM, Mistral, specifically on neuroscience literature, creating a new, more specialised model that they named BrainGPT. This neuroscience-focused LLM performed even better, achieving an accuracy of 86%.

Senior author Professor Bradley Love of UCL believes that the study points towards a future where scientists will leverage AI tools to design more effective experiments. He also notes that the ability of LLMs to predict results suggests that much of scientific research is not truly novel, but instead follows existing patterns.

The team is now developing AI tools to assist researchers, envisioning a future where scientists can input their proposed experiments and expected findings into an AI, which would then offer predictions on the likelihood of various outcomes. This could lead to faster iterations and more informed decision-making in experiment design.

This study was made possible through the support of the Economic and Social Research Council (ESRC), Microsoft, and a Royal Society Wolfson Fellowship. It involved collaborations between several institutions worldwide, including the University of Cambridge, the University of Oxford, and the Max Planck Institute for Neurobiology of Behavior in Germany.