In their comprehensive review of microbiota-gut-brain (MGB) axis research, Hooks et al. raise concerns about the belated adoption of appropriate methods for studying microbiota composition. Recommendations exist – but only rarely find their way into MGB studies. Here, we point out current efforts of standardization and innovation that improve microbiota interpretability and reproducibility and provide guidelines for their application in MGB research.
Microbiome data generation involves multiple decisions concerning sample collection and storage conditions, nucleic acid extraction protocol, sequencing techniques, and pre-processing that are important to generate high-quality data and reproducible results (Costea et al. 2017). Suggestions to optimize and standardize microbiome profiling have been published (Costea et al. 2017; Sinha et al. 2017; Valles-Colomer et al. 2016; Vandeputte et al. 2017c), and, although no complete consensus is achieved, adhering to and following up on such guidelines and being aware of limitations when comparing studies are crucial.
Data analysis techniques are also continuously updated, and microbiome data analysis is no exception. A pitfall of microbiome data, ignored or underestimated until recently, is that the data are compositional: abundances of microbial groups are expressed as proportions (reads mapping to that group relative to the total sequenced library). The application of naive normalization and statistics to such compositional data can lead to erroneous results (Vandeputte et al. 2017b). Compositionality-robust statistics were therefore recently introduced in microbiota research (Gloor et al. 2017); they have become the new standard in the field and are implemented in the most popular pipelines for microbiome data analysis (Bolyen et al. 2018).
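As an illustration of what a compositionality-aware transformation can look like, the following minimal sketch applies a centered log-ratio (CLR) transform, one commonly used log-ratio approach, to the read counts of a single sample. The function name `clr`, the pseudocount of 0.5 used to handle zero counts, and the example counts are assumptions made for this sketch; it is not the specific procedure of the cited works or pipelines.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's read counts.

    The pseudocount (an assumption of this sketch) avoids log(0)
    for taxa that were not detected.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Hypothetical read counts for four taxa in one sample.
sample = [120, 30, 0, 850]
transformed = clr(sample)
print(transformed)

# CLR values are centered: they sum to zero (up to rounding).
print(abs(sum(transformed)) < 1e-9)

# Without a pseudocount, the transform depends only on ratios between
# taxa, so a tenfold deeper library gives the same values.
shallow = clr([10, 20, 70], pseudocount=0.0)
deep = clr([100, 200, 700], pseudocount=0.0)
print(all(abs(a - b) < 1e-9 for a, b in zip(shallow, deep)))
```

The transform expresses each taxon relative to the geometric mean of the sample, which is why sequencing depth drops out when no pseudocount is needed. How zeros are handled is a design choice that affects results and should be reported.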
Still, when interpreting variation in microbiota composition determined by metagenomic approaches, it is important to keep in mind that the information about the microbial densities in the original sample is lost. Without microbial load information, proportional data do not allow us to draw any conclusions regarding directionality of changes. For example, an increase in relative abundance of a single microorganism could just result from it maintaining its initial numbers in a generally decreasing community (Fig. 1). Recent innovations such as quantitative microbiome profiling (QMP) (Vandeputte et al. 2017b) tackle this issue by coupling flow cytometry cell count determination with sequencing, allowing us to recreate absolute abundance profiles from proportional sequence data (Fig. 1). Besides reducing the number of false positives detected in disease association studies, the method also facilitates relating microbial absolute abundances to quantitative physiological parameters. Determination of microbial loads showed that cell densities vary greatly even in healthy subjects but are generally reduced in patients with inflammatory bowel disease. Hence, reduced microbial density could be part of a microbiota signature of disease.
Figure 1. Implications of the compositionality of microbiota data. Three illustrative samples (top) – one from a control (control) and two from patients (case 1 and case 2) – each containing four different microbial taxa, are analyzed by relative (bottom left) or quantitative (bottom right) microbiome profiling. In the original sample, although case 1 has an increased absolute abundance of taxon A, case 2 has decreased abundances of taxa B and C. Relative microbiome profiling results in very similar profiles for the two cases, and alongside the true differences in taxa abundances (true positives: ✓), additional apparent differences are also detected (false positives: ×). In addition, assuming even sequencing depth, samples with reduced microbial density (case 2) are more deeply sampled than the high abundance counterpart (case 1), leading to the detection of taxon D in case 2. In contrast, with quantitative microbiome profiling (coupling DNA sequencing [light blue] with cell count determination by flow cytometry [yellow]), the original absolute abundances of microbial taxa are recreated (although subsampled), and therefore, the information on directionality of the changes is recovered.
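The scenario in Figure 1 can be mimicked with a toy calculation. The sketch below uses invented abundances for three hypothetical taxa (arbitrary units) and shows how two different absolute changes collapse into the same relative profile, and how multiplying proportions by a measured total cell count restores the distinction. The function names and all numbers are assumptions for illustration; the published QMP procedure includes further steps (such as adjusting sampling depth) that are not reproduced here.

```python
def relative_profile(absolute):
    """Convert absolute abundances into proportions that sum to 1."""
    total = sum(absolute.values())
    return {taxon: n / total for taxon, n in absolute.items()}


def quantitative_profile(relative, cell_count):
    """Scale proportions by a total cell count (simplified QMP idea)."""
    return {taxon: p * cell_count for taxon, p in relative.items()}


# Invented absolute abundances (arbitrary units), not measured data.
samples = {
    "control": {"A": 25.0, "B": 25.0, "C": 25.0},
    "case 1": {"A": 50.0, "B": 25.0, "C": 25.0},  # taxon A increased
    "case 2": {"A": 25.0, "B": 12.5, "C": 12.5},  # taxa B and C decreased
}

for name, absolute in samples.items():
    relative = relative_profile(absolute)
    # The sample total stands in for a flow cytometry cell count.
    cell_count = sum(absolute.values())
    print(name, relative, quantitative_profile(relative, cell_count))

# Both cases look identical in relative terms ...
print(relative_profile(samples["case 1"]) == relative_profile(samples["case 2"]))
```

In this toy data both cases have the relative profile A 0.5, B 0.25, C 0.25, so relative profiling alone would suggest that taxon A increased in case 2 even though only B and C changed. Scaling by the cell count separates the two situations again.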
Other variables besides the disease phenotype under investigation can influence microbiota composition in an MGB study. In addition to confounders that are typically already taken into account in clinical studies (e.g., gender and age), microbiota-relevant factors should also be addressed, either in study design (e.g., by matching cases and controls) or by recording them and factoring them into the statistical analyses. Gastrointestinal transit time, medication, diet, and inflammation markers deserve particular attention (Falony et al. 2016; Vandeputte et al. 2015). Transit time is linked to changes in total microbial loads and in the abundance of specific taxa, as the microbial ecosystem goes through different stages of development as it progresses through and remains in the intestinal tract (Falony et al. 2018). Beyond the normal variation observed in healthy individuals, altered transit time is also characteristic of several diseases, including nervous system diseases, being either accelerated (e.g., anxiety disorders; Gorard et al. 1996) or slowed down (e.g., Parkinson's disease; Knudsen et al. 2017). Therefore, to capture disease-associated rather than transit time-associated microbiota variation, gastrointestinal transit time needs to be tracked in MGB studies, either by measurement (magnetic tracking systems) or by using proxies such as the Bristol stool scale (Lewis & Heaton 1997) or stool moisture content (Vandeputte et al. 2017b). Additional important confounders in the MGB context include medications, several of which have been reported to affect microbiota composition, including psychotropic drugs (Cussotto et al. 2018). Effects of drugs on the microbiota can be direct, by affecting growth of specific microorganisms, or indirect, by inducing variations in transit time or host physiology (Forslund et al. 2015; Maier et al. 2018), but in any case need to be disentangled from the disease signal. Diet can also be an important confounder (David et al. 2014), especially if dietary behavioral changes are associated with the disease. Finally, inflammation has an impact on the microbiota (Cenit et al. 2017) and may not be part of the disease manifestation. Although we acknowledge the challenges of assessing dietary intake in a systematic way or controlling for it in study design, measuring both systemic inflammation markers (e.g., C-reactive protein) and specific markers of intestinal inflammation (fecal calprotectin) is straightforward.
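To make the idea of factoring a recorded confounder into the analysis concrete, the following sketch compares an unadjusted case-control difference with one adjusted for a transit-time proxy. All data are invented and constructed so that the taxon abundance depends only on stool moisture; the helper names and the use of simple least-squares residualization are assumptions of this sketch, and real studies would typically rely on dedicated statistical software and multivariate models.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for regressing y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """Part of y that is not explained by a straight line in x."""
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def adjusted_effect(group, outcome, covariate):
    """Effect of group on outcome after removing the covariate from both."""
    _, slope = fit_line(residuals(covariate, group),
                        residuals(covariate, outcome))
    return slope


# Invented data: 0 = control, 1 = case; cases have higher stool moisture.
disease = [0, 0, 0, 0, 1, 1, 1, 1]
moisture = [60.0, 65.0, 70.0, 75.0, 70.0, 75.0, 80.0, 85.0]
# The hypothetical taxon abundance is driven by moisture alone.
abundance = [2.0 * m for m in moisture]

_, unadjusted = fit_line(disease, abundance)
adjusted = adjusted_effect(disease, abundance, moisture)
print("unadjusted case-control difference:", unadjusted)
print("difference adjusted for moisture:", adjusted)
```

In this constructed example the apparent disease association is entirely explained by the moisture proxy, so the adjusted effect is zero up to rounding. The same logic extends to medication, diet, or inflammation markers when they are recorded alongside the samples.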
Finally, the “causality problem” highlighted by Hooks et al. can only be tackled in study design. Strategies such as transplanting/deleting microbiota components associated with the disease to induce/reverse phenotypes in model organisms can provide valuable insights. However, it remains difficult to disentangle direct and indirect contributions of the microbiota to disease onset or pathophysiology. One way to acquire more information on potential mechanisms underlying microbiota-host associations is to assess the metabolic potential of the microbial communities under study, which requires metagenomic shotgun sequencing. Although computationally more challenging and only rarely performed in MGB studies, such data are very valuable for studying the most direct of the proposed microbiota-driven routes of MGB communication: the microbial synthesis and degradation of neuroactive compounds (Lyte & Cryan 2014). Future research will become easier as context-specific tools are developed, such as the recent publication on neuroactive compound metabolism of the human gut microbiota (Valles-Colomer et al. 2019), which provides a catalog to facilitate future MGB shotgun metagenomic analyses.
The MGB field is at an exciting and promising stage. Early adoption of the latest advances in microbiome research by the MGB community, combining careful study design, appropriate analysis techniques, and consideration of known potential confounders, will promote reliable discovery and lead to earlier translation to clinical application.