Brain scans promise insights into the brain and, with them, a way to track down the neuronal causes of mental illnesses. Many small studies with a few dozen participants have reported associations between certain anatomical features of the brain and abnormal behavior. However, an analysis of data from almost 50,000 people now shows that most of these results are not reproducible and stem largely from random variation. Genuine but weaker associations, on the other hand, are often overlooked in small studies.
When it became possible to use magnetic resonance imaging (MRI) to observe the living brain, researchers hoped to soon be able to find and treat the cause of all neurological, psychiatric, and mental illnesses. They also believed they could find connections to our personality and our behavior. But although technical improvements have made MRI images more and more detailed and meaningful in recent decades, these hopes have not yet been fulfilled. While many studies have found significant associations between brain structure and behavior, their findings have not been replicated.
Large samples required
A team led by Scott Marek from the Washington University School of Medicine has now analyzed the problem with these so-called brain-wide association studies. The trigger was Marek’s own association study, in which he wanted to find out how cognitive abilities are represented in the brain. “We ran our analysis on a sample of 1,000 children and found a significant correlation and thought, ‘Great!’ But then we thought, ‘Can we reproduce this with another thousand children?’” says Marek. It turned out that the results were not reproducible. “That just blew my mind because a sample of a thousand children should have been enough. We racked our brains and wondered what was actually going on here.”
To find out how large a sample needs to be for a brain-wide association study to produce reliable results, Marek and his team analyzed the three largest publicly available MRI datasets, totaling nearly 50,000 participants: the UK Biobank (35,375 participants), the Adolescent Brain Cognitive Development Study (11,874 participants) and the Human Connectome Project (1,200 participants). From these, they drew samples of different sizes and examined them for correlations between brain characteristics and a range of demographic, cognitive, psychological and behavioral characteristics. They then tried to replicate each finding in new samples.
Small Studies: Strong but False Findings
“The average sample size of classic brain-wide association studies is only 25 participants,” the researchers report. When they ran their analyses with this sample size, they often found clear associations, as did previously published work on the topic, but they were unable to replicate the results in new samples. Reproducibility improved only with samples of several thousand participants, and the effects observed there were weaker than the chance correlations in the smaller studies.
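The mechanism is easy to reproduce in a simulation. The following Python sketch is purely illustrative and is not the researchers' analysis: it builds a synthetic population in which a brain measure and a behavior score share a weak true correlation (the value 0.05, the population size, the number of repeats and all function names are assumptions), then repeatedly draws samples of 25 and of 2,000 people and records the most extreme correlation found at each sample size.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_population(size, true_r, rng):
    """Synthetic (brain measure, behavior score) pairs with correlation near true_r."""
    pairs = []
    for _ in range(size):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        y = true_r * x + math.sqrt(1 - true_r ** 2) * noise
        pairs.append((x, y))
    return pairs


def sampled_correlations(population, sample_size, repeats, rng):
    """Correlation measured in each of `repeats` random samples of the given size."""
    results = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)
        xs, ys = zip(*sample)
        results.append(pearson_r(xs, ys))
    return results


rng = random.Random(0)  # fixed seed so the run is repeatable
population = make_population(20_000, true_r=0.05, rng=rng)  # assumed weak true effect

small = sampled_correlations(population, 25, 200, rng)
large = sampled_correlations(population, 2_000, 200, rng)

largest_small = max(abs(r) for r in small)
largest_large = max(abs(r) for r in large)
print(f"n=25:   largest |r| over 200 samples = {largest_small:.2f}")
print(f"n=2000: largest |r| over 200 samples = {largest_large:.2f}")
```

In this sketch, some of the 25-person samples show correlations many times larger than the true value, while the 2,000-person samples stay much closer to it.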
The strength of a correlation is expressed as an effect size on a scale from 0 to 1, where 0 means no correlation and 1 a perfect correlation. What counts as a strong effect depends on the field of study; in neuroscience, an effect size of 0.2 is already considered strong. However, many published papers report much larger effect sizes, which according to Marek is actually a sign that something is wrong. “You can find effect sizes of 0.8 in the literature, but nothing in nature has an effect size of 0.8,” says Marek. “The correlation between height and weight is 0.4. The correlation between altitude and daytime temperature is 0.3. These are strong, obvious, easily measured correlations, and they’re nowhere near 0.8. So how do we get the idea that the correlation between two very complex things like brain function and depression is 0.8?”
Future in cooperation
Using their large datasets, Marek and his colleagues found that the associations between brain structure and behavior that were actually reproducible had an average effect size of only 0.01. Such weak effects cannot be detected with small samples. “Our results reflect a systemic, structural problem in studies that aim to find correlations between two complex things like the brain and behavior,” says Marek’s colleague Nico Dosenbach. Because MRI studies are expensive (an hour in the scanner can cost $1,000), most studies have vastly undersized samples, which makes them prone to turning up strong but spurious chance associations while missing genuine but weaker ones. The flood of unreliable studies is slowing progress in understanding the brain, the researchers say.
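How many participants a given effect size demands can be estimated with a standard textbook approximation based on Fisher's z-transformation. The Python sketch below is an illustration rather than the method used in the study; the significance level of 0.05 and the statistical power of 80 percent are conventional assumptions, and required_sample_size is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def required_sample_size(r, alpha=0.05, power=0.80):
    """Approximate n needed to detect a correlation r in a two-sided test.

    Uses the Fisher z approximation: n = ((z_alpha + z_power) / atanh(r))^2 + 3.
    The defaults for alpha and power are conventional assumptions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


# Effect sizes mentioned in the article
for r in (0.8, 0.2, 0.01):
    print(f"r = {r}: about {required_sample_size(r):,} participants")
```

Under these assumptions, an effect size of 0.8 needs only about ten participants, one of 0.2 needs roughly two hundred, and one of 0.01 needs tens of thousands.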
Because no single research group has the time or money to scan thousands of participants for each study, Marek and his colleagues see the solution in creating large, publicly available datasets, similar to what is done with genomic data. “For genomic data, the US National Institutes of Health (NIH) funded large data collections and stipulated that the data must be made publicly available,” says Dosenbach. He proposes a similar approach for brain imaging. By sharing resources, scientists could produce more reliable studies that actually help to better understand mental illness and potentially find treatments. “This work represents an important turning point in linking brain activity and behavior, as it clearly defines not only the previous obstacles but also the promising new paths forward,” says Dosenbach.
Source: Scott Marek (Washington University School of Medicine, St. Louis, USA) et al., Nature, doi: 10.1038/s41586-022-04492-9