Molecular Psychiatry (2023)



Over the past few decades, neuroimaging research in Bipolar Disorder (BD) has identified neural differences underlying cognitive and emotional processing. However, substantial clinical and methodological heterogeneity present across neuroimaging experiments potentially hinders the identification of consistent neural biomarkers of BD. This meta-analysis aims to comprehensively reassess brain activation and connectivity in BD in order to identify replicable differences that converge across and within resting-state, cognitive, and emotional neuroimaging experiments.


Neuroimaging experiments (using fMRI, PET, or arterial spin labeling) reporting whole-brain results in adults with BD and controls published from December 1999 to June 18, 2019 were identified via PubMed search. Coordinates showing significant activation and/or connectivity differences between BD participants and controls during resting-state, emotional, or cognitive tasks were extracted. Four parallel, independent meta-analyses were calculated using the revised activation likelihood estimation algorithm: all experiment types, all resting-state experiments, all cognitive experiments, and all emotional experiments. To confirm reliability of identified clusters, two different meta-analytic significance tests were employed.


205 published studies yielding 506 individual neuroimaging experiments (150 resting-state, 134 cognitive, 222 emotional) comprising 5745 BD and 8023 control participants were included. Five regions survived both significance tests. Individuals with BD showed functional differences in the right posterior cingulate cortex during resting-state experiments, the left amygdala during emotional experiments, including those using a mixed (positive/negative) valence manipulation, and the left superior and right inferior parietal lobules during cognitive experiments, while hyperactivating the left medial orbitofrontal cortex during cognitive experiments. Across all experiments, there was convergence in the right caudate extending to the ventral striatum, surviving only one significance test.


Our findings indicate reproducible localization of prefrontal, parietal, and limbic differences distinguishing BD from control participants that are condition-dependent, despite heterogeneity, and point towards a framework for identifying reproducible differences in BD that may guide diagnosis and treatment.


Bipolar Disorder (BD) is a common, debilitating psychiatric disorder resulting in disease burden worldwide [1]. The past several decades of neuroimaging research have investigated the neural substrates of mechanisms underlying differences in cognitive and emotional processing that are characteristic of BD [2,3], which has enabled the conceptualization of neural models [2,4] that are critical to understanding it. The study of neural differences in BD associated with both brain activation (i.e., regional BOLD signaling) and functional connectivity (i.e., the correlation between different brain regions that can elucidate the nature of neural network dynamics) [5] enables the identification of biomarkers that improve diagnostic precision, facilitate early identification, and inform targets for treatment developments [6]. However, the presence of clinical heterogeneity [7,8] (e.g., differences in healthcare systems [9,10], diagnostic subtypes [11,12], mood state [13,14], treatment response [7,15], comorbidity [7], chronicity, severity [16]), methodological differences (imaging modality, paradigm), and analytical flexibility [17,18], as well as the impact of physiological noise sources [19,20,21] and variability of neural responses to cognitive manipulations [22,23,24,25], may all hinder the identification of consistent neural biomarkers of BD-related illness [7,12,26,27].

While high-powered structural studies [28] and qualitative reviews [2,4] are informative to the development of theoretical models of BD, coordinate-based meta-analysis techniques, such as activation likelihood estimation (ALE) [29,30], can test meta-analytic hypotheses at the level of the whole brain in a spatially unbiased fashion [31], taking into account hundreds to thousands of participants and disparities in experimental design decisions [29]. Moreover, depending on how the hypothesis is constructed, successful refutation of the null hypothesis can provide preliminary evidence for potential reproducible differences distinguishing individuals with BD from control participants [32]. However, the extent to which this is possible depends on the quality and number of studies included.
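As a rough illustration of how ALE pools reported coordinates, the sketch below follows the general scheme described in the ALE literature: each focus is blurred with a 3D Gaussian, each experiment contributes one modeled activation (MA) value per voxel (here the maximum over its own foci, one common choice), and the ALE score is the probabilistic union of MA values across experiments. The fixed `fwhm_mm`, the `peak` scaling, the coordinates, and all function names are assumptions made for illustration. The revised algorithm used in practice derives the kernel width from each experiment's sample size and adds permutation-based thresholding, neither of which is shown here.

```python
import math

def gaussian_weight(dist_mm, fwhm_mm):
    """Unnormalized Gaussian weight at a given distance from a focus."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-(dist_mm ** 2) / (2.0 * sigma ** 2))

def modeled_activation(foci, voxels, fwhm_mm, peak):
    """Per-voxel MA value for one experiment: the maximum over its own foci."""
    return [
        max((peak * gaussian_weight(math.dist(v, f), fwhm_mm) for f in foci),
            default=0.0)
        for v in voxels
    ]

def ale_scores(experiments, voxels, fwhm_mm=10.0, peak=0.5):
    """ALE = 1 - prod_i(1 - MA_i), the probabilistic union across experiments."""
    not_active = [1.0] * len(voxels)
    for foci in experiments:
        ma = modeled_activation(foci, voxels, fwhm_mm, peak)
        not_active = [p * (1.0 - m) for p, m in zip(not_active, ma)]
    return [1.0 - p for p in not_active]

# Hypothetical coordinates (mm): two experiments report nearby foci, one does not.
voxels = [(0.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
experiments = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(40.0, 0.0, 0.0)]]
scores = ale_scores(experiments, voxels)
```

In this toy input the first voxel, where two experiments converge, receives a higher score than the second, which is supported by a single experiment. Taking the maximum within an experiment keeps one experiment with many nearby foci from inflating the score on its own.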

It is widely known that functional neuroimaging studies are hampered by great heterogeneity and low power due to small sample sizes, leading to the use of lower, often uncorrected, thresholds to obtain positive results, and thus a substantial risk of frequent false positive findings [33]. It is thus important to acknowledge that the neuroimaging literature on BD is likely to include numerous under-powered studies using phenotypically heterogeneous samples and to be disproportionately characterized by positive exploratory findings rather than evaluation of the magnitude of a priori hypothesized effects [4]. Nevertheless, meta-analyses are needed to reconcile the literature’s pitfalls and provide a framework to test whether the findings of small, heterogeneous studies are reproducible across different studies [31]. Meta-analyses can be used to synthesize results of individual studies in spite of heterogeneity, thereby allowing readers to draw wider conclusions about the state of the literature at large (including whether any reported effects are reproducible). They also highlight irregularities and issues present in the field, which provides transparency that can guide future study designs and encourage replications [31]. Given the rapid rate at which neuroimaging studies of BD are being conducted and published, meta-analyses are useful in that they comprehensively and quantitatively summarize and integrate disparate findings, building cumulative knowledge and guiding future work [33]. ALE is also statistically conservative [34], using cluster-level family-wise error correction, which leads to a low likelihood of false positive convergence, especially if a significant region includes contributing foci from several studies rather than a disproportionate contribution from a single study [31,35]. Individual studies reporting results at uncorrected thresholds may be used with ALE, given that uncorrected thresholds can provide a favorable balance between false positives and false negatives [36].

Previous coordinate-based meta-analyses have found correlates of BD across emotion-processing experiments distinguishing BD from both non-clinical [37] and clinical controls, such as unipolar depression [38] and schizophrenia [39], across resting-state experiments [40,41], and across both cognitive and emotional experiments [42]. However, these meta-analyses had a narrower focus and were limited by the available data, which often had smaller sample sizes. Additionally, they did not incorporate techniques that have been used more recently in psychiatric neuroimaging research (e.g., Amplitude of Low Frequency Fluctuations (ALFF) [43,44], Independent Component Analysis (ICA) [45], Regional Homogeneity (ReHo) [46], degree centrality (DC) [47], functional connectivity strength (FCS) [47]). Furthermore, despite there being extensive neurocognitive differences in BD [48,49,50,51], there are no ALE meta-analyses of BD solely examining cognitive experiments.

Notwithstanding these gaps and advancements, no meta-analyses have examined the effect of condition (i.e., changes in neural activity and connectivity in response to changing task requirements and/or the level of arousal) by testing for a potential invariant, condition-independent functional marker of BD or for condition-dependent markers (i.e., clusters that converge across experiments or paradigms of one type, but not across experiments of a different type). Given the extensive evidence showing functional and structural differences in limbic regions, particularly the amygdala [4], across different mood states and neuroimaging modalities (e.g., structural magnetic resonance imaging (MRI), diffusion tensor imaging, resting-state, emotional and cognitive paradigms) [4], there are empirical grounds for suggesting the existence of a condition-independent marker of BD. Such a marker could manifest across a variety of cognitive (e.g., working memory) and emotional paradigms, and contribute to the differences in these processes distinguishing BD [51,52,53,54,55,56,57]. Alternatively, functional neural differences in BD may be more selective, with distinct markers being observed across different paradigm types.

Thus, the objective of this investigation was to comprehensively reassess brain activation and functional connectivity in BD in order to identify a reproducible, condition-independent neural correlate of BD that converges across resting-state, cognitive, and emotional experiments combined. To our knowledge, this study is the largest, highest-powered, and most comprehensive meta-analysis of BD functional neuroimaging experiments to date, which is necessary to determine whether the current state of the literature allows for identification of replicable differences in BD despite significant heterogeneity and the mixed reliability of task-based functional MRI (fMRI) for brain biomarker discovery [58,59]. An omnibus meta-analysis across all experiment types tested the hypothesis that there would be a condition-independent neural marker differentiating BD from controls that can be seen regardless of task type, akin to a simple localized deficit that is generalizable across paradigms and modalities [28]. Parallel independent, individual meta-analyses of resting-state, cognitive, and emotional processing experiments each tested for condition-dependent neural signatures of BD. Meta-analysis across resting-state experiments tested for a potential core functional difference [2] that is reliable [43,60] and relatively unconfounded by task effects compared to task-based fMRI [61]. Across cognitive tasks using non-affective stimuli, we hypothesized that there would be a functional difference related to cognition, given that individuals with BD have impaired executive functioning, sustained attention, working and verbal memory [3,48,50,51,57], and that this cognitive signature would differ significantly from the null distribution as well as from resting-state and emotional experiments. We further tested whether any significant cognitive differences were hyperactive or hypoactive in participants with BD. Furthermore, we hypothesized that there would be an emotion-related difference specific to emotion processing-related paradigms, given behavioral differences associated with mood lability and emotion dysregulation in BD [2,52,53]. We also tested whether this signature was specific to valence via post-hoc subgroup meta-analyses of experiments associated with negative, positive and mixed (negative and positive) valence, and whether any significant emotion-processing differences were hyperactive or hypoactive in participants with BD. Finally, we examined the contribution of clinical confounds and nesting (i.e., a single study contributing more than one experiment or contrast to a cluster) using two different meta-analytic significance tests to confirm the reliability of meta-analytic findings.
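To make the notion of nesting concrete, the following minimal sketch flags studies that contribute more than one experiment to a given cluster. It only illustrates the definition given above and is not the significance testing used in this study; the function name, the `(study_id, experiment_id)` record layout, and the sample identifiers are all hypothetical.

```python
from collections import defaultdict

def nested_studies(cluster_contributions):
    """Return studies contributing more than one experiment to a cluster.

    cluster_contributions: iterable of (study_id, experiment_id) pairs,
    one pair per experiment that has a focus inside the cluster.
    """
    experiments_by_study = defaultdict(set)
    for study_id, experiment_id in cluster_contributions:
        experiments_by_study[study_id].add(experiment_id)
    # Keep only studies with two or more distinct contributing experiments.
    return {s: len(e) for s, e in experiments_by_study.items() if len(e) > 1}

# Hypothetical contributions to one cluster.
contributions = [("study_A", "exp_1"), ("study_A", "exp_2"), ("study_B", "exp_1")]
nested = nested_studies(contributions)
```

In this toy input only `study_A` is flagged, because it supplies two distinct experiments to the same cluster.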


Search and selection

Figure 1 depicts the study selection process and reasons for exclusion. Details on eligibility criteria and literature search terms can be found in the Supplementary Methods. In brief, resting-state and task-based (cognitive and emotional) functional neuroimaging experiments using fMRI, positron emission tomography (PET), or arterial spin labeling (ASL) published online from December 1, 1999 through June 18, 2019 were identified from a systematic PubMed search. To be eligible for inclusion, experiments had to report voxelwise whole-brain results in standard stereotaxic space (Montreal Neurological Institute (MNI) or Talairach) that statistically compared adults (≥16 years old) diagnosed with BD to an adult non-BD control group (non-clinical and/or clinical). Eligible analyses were standard whole-brain analyses, seed-to-voxel functional connectivity (including psychophysiological interactions (PPIs), Granger causality mapping (GCM), and beta-series correlation for task-based experiments), ICA, ReHo, ALFF/fractional ALFF (fALFF), voxel-mirrored homotopic connectivity (VMHC), FCS, DC, and eigenvector centrality mapping (ECM). The minimum age was 16 and the mean age was over 18. While the majority of included studies measured adults aged 18 or older, a minority of studies included participants aged 16. These studies were included so as to increase the number of relevant studies in the meta-analysis and be as inclusive as possible. Pediatric (<16 years of age) and at-risk cases of BD were excluded to mitigate variability in neural activations that might be secondary to developing sex hormone effects [62,63,64,65,66].
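The screening rules summarized above can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the `Experiment` record, its field names, and the `is_eligible` function are hypothetical, the search window is passed in by the caller, and the full criteria in the Supplementary Methods (for example, the list of accepted analysis types) are not reproduced.

```python
from dataclasses import dataclass
from datetime import date

ALLOWED_MODALITIES = {"fMRI", "PET", "ASL"}
ALLOWED_SPACES = {"MNI", "Talairach"}

@dataclass
class Experiment:
    modality: str            # imaging modality
    space: str               # stereotaxic space of reported coordinates
    whole_brain: bool        # voxelwise whole-brain results reported
    has_control_group: bool  # BD compared with a non-BD control group
    min_age: float           # youngest participant, in years
    mean_age: float          # sample mean age, in years
    published: date          # online publication date

def is_eligible(exp, search_start, search_end):
    """Apply the summarized screening rules to one experiment record."""
    return (
        exp.modality in ALLOWED_MODALITIES
        and exp.space in ALLOWED_SPACES
        and exp.whole_brain
        and exp.has_control_group
        and exp.min_age >= 16
        and exp.mean_age > 18
        and search_start <= exp.published <= search_end
    )
```

Keeping each criterion as a separate condition makes it straightforward to log which rule excluded a record, which is the kind of tally a selection flowchart reports.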

Fig. 1: Flowchart of study selection.

ASL Arterial Spin Labeling, BD Bipolar Disorder, FDG Fluorodeoxyglucose, fMRI Functional Magnetic Resonance Imaging, PET Positron Emission Tomography, ROI Region Of Interest.

Cognitive experiments were operationalized as tasks using a cognitive paradigm with non-affective stimuli, and contrasts comparing a cognitive challenge to either a less-challenging control (e.g., 3-back vs. 1-back) or a baseline condition (e.g., 0-back, rest) were both included.

Emotional experiments were operationalized as tasks presenting an emotional visual, auditory, or sensory (e.g., pain, odor) stimulus or invoking an emotion (e.g., sad mood induction). Contrasts comparing an emotional condition to either a non-emotional/neutral condition, resting/baseline condition, or other emotional condition were all included. Compound emotional/cognitive tasks, operationalized as cognitive paradigms with an emotional manipulation (e.g., go/no-go with emotional distractors), were also included. Emotional tasks were then further separated into valence classes for post-hoc meta-analyses: Negative valence was operationalized as stimuli representing or invoking fear, sadness, anger, disgust, pain, loss, or punishment; positive valence was operationalized as stimuli representing or invoking happiness, pleasure, or rewards; mixed valence was operationalized as positive and negative valence stimuli/conditions collapsed (e.g., all emotional faces vs. baseline); neutral was defined as a non-emotional condition or stimulus (e.g., blank face, shape).
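The valence classes defined above can be sketched as a small lookup. The label sets mirror the categories listed in the text; the function name, the lowercase label vocabulary, and the choice to treat any unlisted label as neutral are assumptions made for illustration.

```python
# Label sets mirror the categories named in the text; spelling is an assumption.
NEGATIVE = {"fear", "sadness", "anger", "disgust", "pain", "loss", "punishment"}
POSITIVE = {"happiness", "pleasure", "reward"}

def valence_class(stimuli):
    """Assign the emotional stimuli of one contrast to a valence class."""
    labels = {s.lower() for s in stimuli}
    has_negative = bool(labels & NEGATIVE)
    has_positive = bool(labels & POSITIVE)
    if has_negative and has_positive:
        return "mixed"      # positive and negative conditions collapsed
    if has_negative:
        return "negative"
    if has_positive:
        return "positive"
    return "neutral"        # unlisted labels fall through to neutral
```

For example, a contrast collapsing fearful and happy faces against baseline would be classed as mixed, while a contrast using only shapes would be classed as neutral.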