A primer to metabarcoding surveys of Antarctic terrestrial biodiversity

Published online by Cambridge University Press:  13 September 2016

Paul Czechowski*
Affiliation:
Australian Centre for Ancient DNA, University of Adelaide, Adelaide, SA 5000, Australia; Antarctic Biological Research Initiative, 31 Jobson Road, Bolivar, SA 5110, Australia
Laurence J. Clarke
Affiliation:
Australian Centre for Ancient DNA, University of Adelaide, Adelaide, SA 5000, Australia; Australian Antarctic Division, Channel Highway, Kingston, TAS 7050, Australia; Antarctic Climate & Ecosystems Cooperative Research Centre, University of Tasmania, Private Bag 80, Hobart, TAS 7001, Australia
Alan Cooper
Affiliation:
Australian Centre for Ancient DNA, University of Adelaide, Adelaide, SA 5000, Australia
Mark I. Stevens
Affiliation:
South Australian Museum, GPO Box 234, Adelaide, SA 5000, Australia; School of Pharmacy and Medical Sciences, University of South Australia, Adelaide, SA 5000, Australia

Abstract

Ice-free regions of Antarctica are concentrated along the coastal margins but are scarce throughout the continental interior. Environmental changes, including the introduction of non-indigenous species, increasingly threaten these unique habitats. At the same time, the unique biotic communities subsisting in isolation across the continent are difficult to survey due to logistical constraints, sampling challenges and problems related to the identification of small and cryptic taxa. Baseline biodiversity data from remote Antarctic habitats are still missing for many parts of the continent but are critical to the detection of community changes over time, including newly introduced species. Here we review the potential of standardized (non-specialist) sampling in the field (e.g. from soil, vegetation or water) combined with high-throughput sequencing (HTS) of bulk DNA as a possible solution to overcome some of these problems. In particular, HTS metabarcoding approaches benefit from being able to process many samples in parallel, while workflow and data structure can stay highly uniform. Such approaches have quickly gained recognition and we show that HTS metabarcoding surveys are likely to play an important role in continent-wide biomonitoring of all Antarctic terrestrial habitats.

Type
Synthesis
Copyright
© Antarctic Science Ltd 2016 

Introduction

Although only 0.3% of continental Antarctica is ice-free, Antarctica is home to many organisms including bacteria, unicellular eukaryotes, fungi, lichens, cryptogamic plants and invertebrates that are scattered across the continent and subsist in isolated, remote, island-like habitats (Convey et al. 2014), for example, in soils, lakes and cryoconite holes. Biodiversity information from these Antarctic areas is required for three major reasons. First, such data facilitate the investigation of glacial constraints and effects on current biodiversity (Convey et al. 2009). Second, they allow investigation of the effects of environmental change on Antarctic ecosystems (Nielsen & Wall 2013). Third, they make conservation management possible, particularly in light of increasing threats from non-indigenous invasive species (Chown et al. 2012a, 2012b, 2015b). However, knowledge of terrestrial Antarctic biodiversity is still limited because the vast majority of Antarctica’s ice-free areas remain un- or under-studied (McGaughran et al. 2011, Convey et al. 2014).

Biodiversity research in ice-free habitats of Antarctica is complicated. First, logistic difficulties exacerbated by the harsh environmental conditions may limit biological research to the proximity of research stations, even when more extensive field work is required (Convey 2010). Second, traditional biodiversity assessments of many multicellular eukaryotes include manual sorting and morphological identification, which are time consuming and require specific taxonomic expertise, especially for the cryptic and inconspicuous terrestrial life of Antarctica (e.g. Velasco-Castrillón et al. 2014a). Molecular methods are better suited to the study of Antarctic biota (Rogers 2007), but may lack resolution when sequence information is not considered (e.g. the analysis of terminal restriction fragment length polymorphisms or similar techniques; Magalhaes et al. 2012, Makhalanyane et al. 2013, Dreesens et al. 2014) or may be labour intensive (e.g. Sanger sequencing; Lawley et al. 2004, Fell et al. 2006, Velasco-Castrillón & Stevens 2014, Velasco-Castrillón et al. 2014b, 2014c).

Readily applied in many other parts of the world (reviewed in Bik et al. 2012a, Bohmann et al. 2014), metabarcoding approaches (sensu Taberlet et al. 2012a) present an opportunity to rapidly generate baseline biodiversity information for a variety of terrestrial Antarctic habitats (Fig. 1; Chown et al. 2015b). Metabarcoding approaches use the genetic material from bulk environmental samples such as soil, permafrost, water, ice, snow or other substrates (Bohmann et al. 2014). DNA from the multiple organisms contained in such samples is then identified for taxonomic analyses, either with traditional Sanger sequencing or, more recently, using high-throughput sequencing (HTS; Chown et al. 2015b, Cowan et al. 2015, Czechowski et al. 2016). In a global context, HTS-supported metabarcoding approaches have been applied to monitoring invasive species and surveying biodiversity over large spatial scales (Drummond et al. 2015 and reviewed in Bohmann et al. 2014). In Antarctica, metabarcoding studies, initially based on Sanger sequencing, have enabled the identification of cryptic organisms and communities such as fungi, yeast and invertebrates (Lawley et al. 2004, Fell et al. 2006). These techniques have also been applied to viruses (López-Bueno et al. 2009), bacteria in hypolithic communities, soil and air (Makhalanyane et al. 2013, Bottos et al. 2014a, 2014b), as well as fungi and unicellular eukaryotes of soils (Dreesens et al. 2014, Niederberger et al. 2015). Additionally, the methodological pitfalls of these techniques when applied in Antarctica have become better understood (Lee et al. 2012b, Czechowski et al. 2016), including amplification and sequencing biases, coupled with sparse reference data. Collectively, HTS metabarcoding, despite not being without flaws, provides a promising method to rapidly gather biodiversity information from Antarctic habitats, with the ability to generate large amounts of biodiversity data from a wide range of taxa with simple sample collection, uniform laboratory workflows and comparable data structures.

Fig. 1 Workflow for metabarcoding analyses, which can be applied to soil, snow, ice, cryoconite holes, lake sediments or nearshore marine environments. a. Samples are collected. b. The genetic material is extracted in bulk from individual samples. c. DNA contained in extracts is amplified with genetic markers and sequencing adapters, multiplex identifier (MID) tags are added. d. The library is processed on a high-throughput sequencing device. e. After data deconvolution according to sample, reference information assigns individual sequences or sequence clusters with taxonomic information. f. Distributional information becomes available. Picture of sequencing device provided courtesy of Illumina (San Diego, CA, USA). Base layers courtesy of the Scientific Committee on Antarctic Research Antarctic Digital Database.

Here, we provide a technical introduction to HTS metabarcoding with an Antarctic focus and highlight the potential of such approaches for Antarctic biodiversity research beyond their current applications. This synthesis serves as a starting point for the development of Antarctic HTS metabarcoding surveys. We hope to encourage fellow researchers to participate in the joint effort of understanding Antarctica’s biodiversity on a continental scale (Kennicutt et al. 2014).

Technical considerations

Metabarcoding projects are influenced by biases inherent to several methodological aspects. These include: i) extraction of representative DNA from a mixed template sample and the intra- and extracellular DNA contained in such a sample, ii) platform-specific sequencing technologies including inherent sequence error patterns, iii) the appropriate choice of markers, iv) methods for generation and v) amplification of sequencing libraries. Finally, informed approaches to HTS data processing and analysis are necessary to achieve research goals.

Sample selection

As shown in a variety of global studies (reviewed in Bohmann et al. 2014), it is possible to extract DNA suitable for metabarcoding analyses from a variety of substrates, which offers a unique opportunity to study different environments in Antarctica (Fig. 1a). DNA can be extracted from organisms contained in surface soil (Czechowski et al. 2016), permafrost (Bellemain et al. 2013), snow (Dalén et al. 2007), ice (Willerslev et al. 2004), freshwater benthos of lakes (Hajibabaei et al. 2012) or nearshore marine sediments (Powell et al. 2003). Furthermore, extracts of pre-sorted samples, such as those from museum collections, can be analysed (Drummond et al. 2015). When limited starting material is available, preservatives such as ethanol can be used as a DNA source (Shokralla et al. 2010). DNA can also be extracted from faeces (Jarman et al. 2013), for example from the seal and penguin colonies in coastal regions of Antarctica and sub-Antarctic islands. The variety of potential sample types, coupled with cost-effective sequence data generation, could address problems related to surveying large spatial or temporal scales.

Extraction of environmental samples

Failure to extract representative DNA from a sample (Fig. 1b), so-called ‘extraction bias’ (Pedersen et al. 2014), is a major concern for metabarcoding approaches. Such biases occur when extraction methods inconsistently lyse cells of different organisms, and are compounded by the presence of DNA from dead organisms in substrates (Pedersen et al. 2014). Some authors explicitly distinguish an intracellular DNA component from an extracellular DNA component in bulk extracts, and present methods to quantify both fractions in a given sample (Ascher et al. 2009). Yet, applying such approaches across large numbers of samples may be cost-prohibitive. Extraction biases can be alleviated by combining different extraction methods, blending samples prior to extraction and/or using a large amount of starting material (Delmont et al. 2011, 2013, Taberlet et al. 2012b). Yet, different extraction methods or batch-wise application of one extraction method may introduce variable levels of non-template contamination (Salter et al. 2014). Therefore, randomized drawing of sample batches is recommended (Salter et al. 2014). Extraction biases and contamination can be discovered by inclusion of negative and positive controls. Negative controls facilitate the detection of contamination. Positive controls of known taxonomic composition are helpful in detecting compositional deviations between the sequence data and sample source (Salter et al. 2014, Czechowski et al. 2016). Consequently, both positive and negative controls help to optimize the DNA extraction process and to streamline processing parameters in the steps following extraction.
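
As an illustration of how negative controls might be used downstream, the following Python sketch flags every phylotype that appears in an extraction blank and drops it from the real samples. The sample names, phylotype labels, read counts and the `min_reads` threshold are hypothetical, and blanket removal is only one simple policy assumed here for illustration; it is not a procedure prescribed by the studies cited above.

```python
def flag_contaminants(counts, negative_controls, min_reads=1):
    """Return phylotypes with at least `min_reads` reads in any negative control.

    counts: dict mapping sample name -> dict of phylotype -> read count.
    """
    flagged = set()
    for sample in negative_controls:
        for phylotype, n in counts.get(sample, {}).items():
            if n >= min_reads:
                flagged.add(phylotype)
    return flagged


def remove_flagged(counts, flagged, negative_controls):
    """Drop flagged phylotypes from all non-control samples."""
    return {
        sample: {p: n for p, n in table.items() if p not in flagged}
        for sample, table in counts.items()
        if sample not in negative_controls
    }


# Hypothetical read counts per sample and phylotype.
counts = {
    "soil_A": {"OTU1": 120, "OTU2": 4, "OTU3": 30},
    "soil_B": {"OTU1": 80, "OTU3": 2},
    "extraction_blank": {"OTU2": 15},
}
controls = ["extraction_blank"]
flagged = flag_contaminants(counts, controls)
cleaned = remove_flagged(counts, flagged, controls)
```

Raising `min_reads` makes the filter less aggressive, which may matter when a genuinely abundant sample phylotype leaks into a blank at low read numbers.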

High-throughput sequencing platforms

The recent advance of HTS-supported metabarcoding, metagenetics and metagenomics (Bik et al. 2012a, Taberlet et al. 2012a, Bohmann et al. 2014) can be considered a consequence of continuing development of sequencing platforms by companies such as 454 (Roche, Basel; official platform support was discontinued in 2016), Illumina (San Diego, CA, USA), IonTorrent (Thermo Fisher Scientific, Waltham, MA, USA) and others since 2005 (Glenn 2011, van Dijk et al. 2014). These devices generate substantially larger amounts of sequencing data than chain-termination sequencing (Bohmann et al. 2014), but in comparison produce shorter reads (i.e. ~100–800 base pairs, depending on the technology). Using these platforms in conjunction with metabarcoding (and metagenomic) approaches removes the need to process mixed DNA templates through clone libraries and hence substantially reduces the time to data generation. The platform of choice to conduct metabarcoding biodiversity surveys currently appears to be one of the Illumina platforms, due to the large number of sequences generated, which reduces the cost per base, and the comparatively low error rate of this sequencing platform (Bokulich et al. 2013, Bragg et al. 2013). The discontinued 454 platform, although often comparatively expensive to use, will continue for some time to offer the longest read lengths of all platforms suitable for amplicon sequencing (van Dijk et al. 2014). Comprehensive reviews of HTS platforms are provided in Glenn (2011) and van Dijk et al. (2014). Currently, the most common and cost-effective approach to generate metabarcoding information with HTS is parallel sequencing of PCR-amplified bulk DNA extracts, known as ‘amplicon sequencing’ (Taberlet et al. 2012a, Bohmann et al. 2014). Important methodological aspects of amplicon sequencing are described below. We also describe how pitfalls of amplicon sequencing can be alleviated and present alternative methods for library generation and sequencing.

Marker choice

Markers for PCR amplification of mixed DNA templates extracted from environmental samples (e.g. soil (Fig. 1c), permafrost, water, ice, snow, etc.) should i) ideally amplify all taxa with similar efficiency despite potential mismatches between primers and the variety of template molecules (Clarke et al. 2014a), ii) amplify target regions short enough to allow amplification of degraded DNA, particularly if targeting extracellular DNA (Riaz et al. 2011, Coissac et al. 2012, Taberlet et al. 2012a), iii) contain as few degenerate bases as possible to allow the application of high annealing temperatures, while decreasing the risk of chimeric amplification (Lenz & Becker 2008, Ahn et al. 2012) and iv) target a gene region for which ample reference data are available to allow taxonomic identification of phylotypes (see below).
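
Criterion iii) can be made concrete by counting the degenerate positions of a candidate primer and the number of distinct oligonucleotides it represents. The Python sketch below does this using the standard IUPAC nucleotide codes; the example primer is an invented sequence used purely for illustration, not a published marker.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Return (number of degenerate positions, number of distinct oligos)."""
    positions = 0
    variants = 1
    for base in primer.upper():
        options = len(IUPAC[base])  # raises KeyError on a non-IUPAC symbol
        if options > 1:
            positions += 1
        variants *= options
    return positions, variants


# Hypothetical primer with three degenerate positions (R, Y and N).
result = degeneracy("ACGTRYACGN")
```

For this invented primer the function returns three degenerate positions and 2 × 2 × 4 = 16 distinct oligonucleotides, so any single oligonucleotide makes up only a small fraction of the primer pool.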

Finding a primer pair that possesses these desirable, possibly incompatible, qualities is challenging. Two genes that have been widely applied in single-gene and metabarcoding analyses of metazoans, for example, are the nuclear 18S ribosomal DNA (18S rDNA) and mitochondrial cytochrome c oxidase subunit I (COI) genes (Wu et al. 2011, Zhan et al. 2014). These markers are favoured due to their long history of application, resulting in comparatively abundant reference data in sequence repositories such as GenBank, BOLD and SILVA (Pruesse et al. 2007, Ratnasingham & Hebert 2007, Benson et al. 2011) (Fig. 1e & f). However, 18S rDNA data may underestimate biodiversity due to low taxonomic resolution, and many COI markers show inherent taxonomic bias due to insufficiently conserved primer binding sites across broad taxonomic groups (Tang et al. 2012, Deagle et al. 2014). Analogous advantages and disadvantages apply to other marker regions used in metabarcoding studies, for example when targeting fungi using the ITS region or photosynthetic cryptogams via the matK and chloroplast genes (CBOL Plant Working Group 2009, Orgiazzi et al. 2013, Drummond et al. 2015).

Library generation

Preparing DNA for HTS requires the addition of platform-specific sequencing adapters, and often (particularly for metabarcoding) sample-specific sequence tags (or ‘multiplex identifier’ (MID) tags) are also required to enable deconvolution of sequence data (Fig. 1c & e). Initially, DNA pools were furnished with MID tags during PCR (Saiki et al. 1988) via extended primer sequences or ligation of unmodified primers preceding sequence adapter ligation (Binladen et al. 2007, Meyer et al. 2008). More recently, library generation via long primer sequences carrying both sequence adapter and MID tags (fusion primers) has become common (Bik et al. 2012a). The application of fusion primers is practical in that it only requires a single PCR, but may be costly for large numbers of samples and difficult for primer lengths above ~50 base pairs due to poor PCR performance. In those cases, more labour-intensive ligation protocols may be a better choice (Stiller et al. 2009, Kircher et al. 2012, O’Neill et al. 2013). Also of concern is the informed choice of MID tags. Owing to possible flaws in the underlying algorithms, these tags may not meet the intended expectations of robustness towards sequencing errors (Faircloth & Glenn 2012). Only MID tags that have been explicitly tested for correct Hamming distances (Hamming 1950) are recommended; such tags later enable correct deconvolution and error correction (Faircloth & Glenn 2012).
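
A check of this kind is straightforward to script. The Python sketch below computes all pairwise Hamming distances within a tag set and lists pairs that fall below a chosen minimum. The four six-base tags and the threshold of three are hypothetical values chosen for illustration. Note that the Hamming distance only counts substitutions between tags of equal length, so insertions and deletions are not covered by this check.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def too_close(tags, min_dist):
    """Tag pairs whose Hamming distance is below `min_dist`."""
    return [(a, b) for a, b in combinations(tags, 2) if hamming(a, b) < min_dist]


# Hypothetical MID tags; the first and third differ at only two positions.
tags = ["ACGTAC", "TGCATG", "ACGTTG", "GATCCA"]
smallest = min_pairwise_distance(tags)
offending = too_close(tags, min_dist=3)
```

A tag set with minimum distance d can detect up to d − 1 substitution errors and correct up to ⌊(d − 1)/2⌋ of them, so the pair reported here (distance two) would allow a single substitution to be detected but not corrected.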

Amplification

Concordance between the taxonomic composition of a mixed DNA template retrieved from environmental bulk samples and the amplified library requires careful calibration of PCR conditions, for example, length optimization of denaturation, annealing and extension steps as well as the correct temperatures for the primer annealing phase (Fig. 1c). Possible pitfalls include i) introduction of substitutions and insertion/deletions through polymerase activity (Cline et al. 1996), ii) formation of chimeric molecules in late amplification stages (Kanagawa 2003), iii) amplification bias when using degenerate primers in combination with high annealing temperatures (Cline et al. 1996, Kanagawa 2003), and iv) failure to detect rare variants when little replication is applied (Ficetola et al. 2015). Such pitfalls collectively threaten the credibility of the resulting sequence data (Czechowski et al. 2016). They may i) alter the similarity of phylotypes to reference sequences, ii) result in artificial phylotypes that match several reference sequences, iii) artificially enrich phylotypes whose library molecules matched the PCR primers well or iv) result in false-negative concealment of phylotypes. Retrieval of higher quality data can be achieved by i) application of proofreading polymerases (Taberlet et al. 2012a), ii) using few and long PCR cycles (Kanagawa 2003, Lenz & Becker 2008, Ahn et al. 2012), iii) careful testing of annealing temperature (Sipos et al. 2007) and iv) processing three or more PCR replicates (Gilbert et al. 2010). Analogous to the extraction step, positive and negative controls are important to track contamination during the amplification procedure (Czechowski et al. 2016). At the same time, positive controls may be a source of cross-contamination, for example, through unintended PCR-product carry-over (Kwok 1990). Using suitable non-Antarctic control DNA, which can be distinguished from sample DNA in later analysis steps, could reduce the impact of PCR-product carry-over.

Sequence analysis

Processing HTS data (Fig. 1e & f) is not straightforward: it requires a high level of bioinformatics expertise, project-specific software selection and software fine-tuning at every step. To perform a metabarcoding analysis with any given raw dataset, first an analysis workflow needs to be conceptualized. Then, a variety of software algorithms need to be selected with regard to the analysis steps and study goals, keeping in mind available computing hardware, methods of library design, employed sequencing technology, data volume and analysis pitfalls (Coissac et al. 2012, Lee et al. 2012b). Subsequently, testing programs individually and in order of application using small datasets is advisable. Here, it may be necessary to generate custom (or at least modify existing) scripts, through which data input and output of algorithms is handled and connected.

The resulting metabarcoding analysis workflow may begin with marker selection (Riaz et al. 2011), and once sequence data have been generated, several raw data processing steps follow before the statistical analysis can be attempted (Bik et al. 2012a, Bohmann et al. 2014). Raw data preparation typically includes quality filtering, removal of sequence adapters, data deconvolution and chimera removal. The cleaned data are then typically clustered and assigned taxonomy, and the resulting data are subsequently checked for their suitability for the intended statistical analysis.
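
Two of these preparation steps, quality filtering and data deconvolution, can be sketched in a few lines of Python. The example below assumes reads are already available as (sequence, quality string) pairs with Phred+33 encoded qualities, that each read starts with its MID tag, and that tags are matched exactly; the tag-to-sample mapping, the reads and the quality threshold are all hypothetical. Adapter trimming, error-tolerant tag matching and chimera removal are deliberately left out, and dedicated tools should be preferred for real data.

```python
def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def demultiplex(reads, tag_to_sample, tag_len, min_mean_q=20):
    """Assign reads to samples by exact match of their leading MID tag.

    reads: iterable of (sequence, quality_string) pairs of equal length.
    Returns a dict mapping sample name -> list of tag-trimmed sequences.
    """
    by_sample = {}
    for seq, qual in reads:
        # Discard reads that are too short or of low average quality.
        if len(seq) <= tag_len or mean_quality(qual) < min_mean_q:
            continue
        sample = tag_to_sample.get(seq[:tag_len])
        if sample is None:
            continue  # tag not recognised
        by_sample.setdefault(sample, []).append(seq[tag_len:])
    return by_sample


# Hypothetical tags and reads.
tag_to_sample = {"ACGT": "soil_A", "TGCA": "soil_B"}
reads = [
    ("ACGTGGCATT", "IIIIIIIIII"),  # good quality, tag of soil_A
    ("TGCAGGCATT", "IIIIIIIIII"),  # good quality, tag of soil_B
    ("ACGTGGCTTT", "!!!!!!!!!!"),  # low quality, discarded
    ("GGGGGGCATT", "IIIIIIIIII"),  # unknown tag, discarded
]
assigned = demultiplex(reads, tag_to_sample, tag_len=4)
```

Exact matching is the strictest possible choice; tolerating mismatches is only safe when the tag set has been checked for sufficient pairwise distance.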

Although raw data preparation can be achieved with a variety of programs (see Table I for examples), software environments dedicated to metabarcoding analysis such as QIIME, MOTHUR and MG-RAST (Schloss et al. 2009, Caporaso et al. 2010, Wilke et al. 2016) offer functionality covering whole analysis workflows, from raw data cleaning through phylotype clustering to basic statistical analyses. These metabarcoding/metagenomic software environments themselves usually take advantage of multiple algorithms dedicated to particular sub-routines of analysis workflows. For example, chimera detection may be achieved with UPARSE (Edgar 2013) in QIIME. Taxonomic assignments may be retrieved with BLAST (Altschul et al. 1990) or other algorithms such as the RDP classifier, or UCLUST (Edgar 2010, Lan et al. 2012). In general, sub-algorithms employed by software suites need to be carefully considered before attempting data preparation and analysis steps. An overview of raw data preparation software is provided by Zhou & Rokas (2014).

Table I Selection of analysis software for metabarcoding data of environmental DNA. Possible tasks related to handling of metabarcoding data provided in columns. Multi-purpose tools in top section are suitable for various sequence analysis tasks. Software environments in the middle section are specialized for metabarcoding analysis. Programs and packages in the bottom section are focussed on ecological analysis and/or visualizing results. X=functionality of a given tool for a given task is provided, (X)=functionality of a given tool for a given task is provided to some extent, -=functionality is not provided.

If dedicated metabarcoding/metagenomic analysis environments do not offer desired functionalities for analysis and visualization, some analyses can be achieved through other available software and possibly linked in via ‘glue code’ written in programming languages such as R (R Development Team 2016), BASH or PYTHON (Van Rossum & Drake 1995). EXPLICET (Robertson et al. 2013), for example, offers basic visualization and statistical analysis functionality coupled with a graphical user interface, suitable for novice users. More powerful, but command-driven, the R environment offers several packages for the statistical analysis and visualization of metabarcoding data, such as PHYLOSEQ or VEGAN (McMurdie & Holmes 2013, Oksanen et al. 2015, R Development Team 2016) and many others. An in-depth review of metabarcoding and metagenomic sequence analysis software is provided by Lindgreen et al. (2016).

Biodiversity surveys often seek to quantify α-diversity (species richness) and β-diversity (change in community composition; Whittaker 1960). Three pitfalls should be carefully considered when designing metabarcoding analysis workflows that compare α- and β-diversity among a given set of samples, or that relate species occurrences to, for example, their environment: a) all analyses require sufficient sequencing depth, b) α- and β-diversity comparisons require appropriate abundance correction of libraries with different read depths, and c) distance-based ordination techniques can only be applied appropriately with a suitable distance measure. Regarding a), sequencing depth of HTS libraries is crucial for the reliable estimation of biodiversity measures in the resulting data (Smith & Peay 2014), and a high sequencing effort may be necessary to obtain credible statistical results from Antarctic habitats (Czechowski et al. 2016). Regarding b), the analysis of differentially abundant phylotypes in metabarcoding data by comparison of proportions or rarefied counts, although widely applied, is inappropriate and may yield misleading results (McMurdie & Holmes 2014). Instead of rarefaction, other algorithms should be employed to enable comparison between libraries with coverage differences. For instance, the R packages DESeq and edgeR offer alternative ways to correct phylotype abundance (Robinson et al. 2009, Anders & Huber 2010). Regarding c), scarcity of biological data is known to impair ecological statistical analysis in Antarctica due to the low spatial overlap of individual phylotypes (Magalhaes et al. 2012, Czechowski et al. 2016). Consequently, metabarcoding data from many Antarctic habitats are likely to be difficult to analyse with commonly used distance-based ordination methods, including multidimensional scaling (Wish & Carroll 1982), constrained analysis of principal coordinates (CAP) and redundancy analysis (RDA) (Legendre & Andersson 1999). Hence, when employing distance-based ordination approaches, sample comparison may only be possible with metrics established to be suitable, for example the Hellinger distance (Gagné & Proulx 2009). Alternatively, several different distance metrics should be compared (Blanchet et al. 2014). Model-based ordination methods may circumvent drawbacks of distance-based ordinations, and should be used where possible (Ellis et al. 2012, Wang et al. 2012, Hui et al. 2015).
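
As a hedged illustration of why the choice of transformation and distance matters for libraries with different read depths, the following minimal sketch computes observed richness and the Hellinger distance in plain PYTHON. The read counts, sample names and function names are hypothetical, and the distance is assumed to take the form commonly used in community ecology (Euclidean distance between square-root-transformed relative abundances); it is not code taken from any of the packages cited above.

```python
import math


def richness(counts):
    """Observed richness: number of phylotypes with at least one read."""
    return sum(1 for c in counts if c > 0)


def hellinger_transform(counts):
    """Square root of each phylotype's relative abundance in one sample."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return [math.sqrt(c / total) for c in counts]


def hellinger_distance(sample_a, sample_b):
    """Euclidean distance between two Hellinger-transformed samples."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same phylotypes")
    ta = hellinger_transform(sample_a)
    tb = hellinger_transform(sample_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ta, tb)))


# Hypothetical read counts per phylotype (columns) for three samples.
site_a = [120, 30, 0, 0, 50]
site_b = [12, 3, 0, 0, 5]    # same composition as site_a, tenfold fewer reads
site_c = [0, 0, 40, 60, 0]   # shares no phylotypes with site_a

d_ab = hellinger_distance(site_a, site_b)
d_ac = hellinger_distance(site_a, site_c)
```

In this toy example, two libraries with identical composition but a tenfold difference in depth have a distance of zero, whereas two samples that share no phylotypes reach the maximum of √2, which is the situation to expect when spatial overlap of phylotypes is low.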

Recent improvements of high-throughput sequencing metabarcoding

Retrieving biodiversity information from hundreds of samples over large spatial or temporal scales requires cost-efficient processing. Tagging individual samples with fusion primers for amplicon sequencing is simple, but increases the cost of large-scale HTS metabarcoding studies. Presumably for this reason, the number of samples processed in parallel in several recent global and Antarctic metabarcoding studies ranges from seven to twelve (Bik et al. 2012b, Roesch et al. 2012, Dreesens et al. 2014, Niederberger et al. 2015). Primer-associated costs can be reduced through modular combination of multiple sequence tags per sample, which reduces the number of unique oligonucleotides required for a project. Examples of such modular workflows include using two PCRs to double-tag amplicons for HTS (Bybee et al. 2011, de Cárcer et al. 2011). Similarly, double-tagging can generate amplicons with minimal work, handling and cost in a single PCR (Clarke et al. 2014b).
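
The saving from modular tagging can be sketched with simple arithmetic. The example below assumes, for illustration only, that each sample is identified by a unique pair of one forward and one reverse tag and that every pair is usable, so that f forward and r reverse tags distinguish f × r samples; controls for unused combinations and other practical constraints are ignored. The function name is hypothetical.

```python
import math


def min_tagged_primers(n_samples):
    """Smallest (forward, reverse) tag counts with forward * reverse >= n_samples.

    Assumes every forward/reverse tag pair identifies exactly one sample.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    best = None
    for forward in range(1, n_samples + 1):
        # Fewest reverse tags that still cover all samples with this many forward tags.
        reverse = math.ceil(n_samples / forward)
        if best is None or forward + reverse < sum(best):
            best = (forward, reverse)
    return best


# Hypothetical project with 96 samples.
forward_tags, reverse_tags = min_tagged_primers(96)
total_tagged_primers = forward_tags + reverse_tags
```

Under these assumptions, 96 samples need 20 tagged oligonucleotides in total (for example 8 forward and 12 reverse) rather than one uniquely tagged primer per sample, and the gap widens as sample numbers grow.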

PCR biases during library preparation can be alleviated through hybridization approaches, in which libraries are generated by annealing target DNA to biotinylated oligonucleotide probes (Gnirke et al. 2009, Faircloth et al. 2012, Lemmon et al. 2012). In comparison to PCR, hybridization approaches enable retrieval of multiple conserved regions per reaction, perform well in detecting rare DNA and reduce compositional biases in the resulting data without the need for extensive replication (Taberlet et al. 2012a). For example, Denonfoux et al. (2013) sequenced bacterial DNA derived from environmental samples after enrichment with a hybridization approach, demonstrating the benefits outlined here for mixed template DNA sources.

The lengths of genomic regions that can be covered by a single read of a given HTS platform are variable (see section ‘High-throughput sequencing platforms’), but usually shorter than the 600–1000 base pairs that can be achieved from a single read using Sanger sequencing technology. Therefore, recent research has investigated the option of adopting shorter fragments of regions that have been widely used in Sanger sequencing, for example, the beginning of the COI gene region or the 18S gene (Machida & Knowlton 2012, Leray et al. 2013). Other studies have identified new marker regions that are short enough for HTS technologies and retain adequate information to allow comparisons with data from the traditional markers (CBOL Plant Working Group 2009, Epp et al. 2012). Furthermore, identification of custom marker regions is now possible with bioinformatics tools such as ecoPrimers, incorporated into OBITools (Riaz et al. 2011, Boyer et al. 2016) (see Table I). ecoPrimers employs user-curated reference data retrieved from repositories such as GenBank (Benson et al. 2011) to identify conserved regions suitable for project-specific primer design for mixed template amplification.
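
To make the notion of a ‘conserved region’ concrete, the toy sketch below scans a set of pre-aligned reference sequences for windows in which all sequences are identical. It is only an illustration under simplifying assumptions (equal-length, gap-free, already aligned sequences; perfect identity required) and does not reproduce how ecoPrimers works internally; the sequences and the function name are hypothetical.

```python
def conserved_windows(aligned, k):
    """Start positions of length-k windows that are identical in all sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # A column is conserved when every sequence carries the same base there.
    conserved = [len({seq[i] for seq in aligned}) == 1 for i in range(length)]
    return [i for i in range(length - k + 1) if all(conserved[i:i + k])]


# Hypothetical aligned reference sequences; only position 4 varies.
references = ["ACGTTGCAAC", "ACGTAGCAAC", "ACGTCGCAAC"]
starts = conserved_windows(references, 4)
```

Windows returned by such a scan flank the variable position and would be candidate primer-binding sites in this simplified picture, while the variable stretch between them carries the taxonomic signal.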

Other methods of streamlining metabarcoding approaches with regard to data yields and cost efficiency have become available. A combination of shotgun sequencing and amplicon sequencing, for instance, allows retrieval of full-length COI sequences using HTS technology (Liu et al. 2013). Furthermore, omitting library quantification and instead pooling libraries by volume (coupled with shearing and re-assembly; Feng et al. 2015) can reduce time and effort during library construction. Finally, with decreasing sequencing costs, metagenomic studies targeting the entirety of DNA molecules in a sample, including functional genes, without selective amplification or enrichment (Fierer et al. 2012) may become viable for large sample numbers.

The potential of metabarcoding and metagenomics for Antarctic biology

Elucidating community structures

Community-level interaction is an important feature of Antarctic ecosystems. Such interactions were believed to be minimal, perhaps because they are hard to measure (Hogg et al. 2006). However, biotic community-level interactions are increasingly implicated in facilitating survival in harsh environments, and may be observable through stratified occurrence of different organisms or the exchange of nutrients between strata within communities (Nakai et al. 2012, Pointing & Belnap 2012). Community-level organization has been discovered among Antarctic soil crusts, lithobiontic communities, eukaryotes in moss pillars and cyanobacterial mats (Jungblut et al. 2012, Nakai et al. 2012, Makhalanyane et al. 2013, Colesie et al. 2014). Evidence for biotic interactions has also been reported among soil arthropods of sub-Antarctic islands (Caruso et al. 2013).

Studies describing the community-level organization of Antarctic terrestrial ecosystems will benefit from metabarcoding and metagenomic approaches. Possible studies could include further analyses of prokaryotic and eukaryotic diversity in substrates such as snow, soil crusts and hypolithons, photobiotic and mycobiotic diversity and biogeography of lichens, or the association between fungi and eukaryotes in moss communities, all of which are still often studied using Sanger sequencing (Carpenter et al. 2000, Fernández-Mendoza et al. 2011, Khan et al. 2011, Jungblut et al. 2012, Gokul et al. 2013, Altermann et al. 2014). The HTS-supported analysis of such communities is becoming more common for eukaryotes and bacteria, e.g. in cyanobacterial mats and hypolithic communities (Lee et al. 2012a, Dreesens et al. 2014, Niederberger et al. 2015). Similar approaches could also be applied to environments such as air (Bottos et al. 2014b) or nearshore sediments (Powell et al. 2003). Functional aspects of soil microbial communities have been investigated using metagenomic HTS approaches, providing an in-depth picture of ecosystem services (Fierer et al. 2012). Similarly, HTS-based metagenomic studies advanced the description of morphologically conserved, rare or small cryptic communities in Antarctica, including their provision of ecosystem services (Goordial et al. 2016).

Supporting conservation of Antarctica

The biodiversity and distribution of the terrestrial Antarctic biota is more heterogeneous than anywhere else in the world (Ettema & Wardle 2002, Convey et al. 2014, Chown et al. 2015a). Large distances between habitats, unique geological and glacial histories, different soil compositions and extreme fluctuations of abiotic conditions amplify this heterogeneity (Bockheim 1997, Marchant & Head 2007, Bintanja et al. 2014). Consequently, Antarctic biota exhibit a high degree of endemism and costly adaptation mechanisms to withstand harsh environmental conditions (Convey 1997, Convey & Stevens 2007). Human-mediated environmental changes are anticipated to have profound effects on the spatial extent and structure of Antarctic terrestrial ecosystems (Chown et al. 2012a, 2012b). Despite a high degree of isolation between continental habitats (Convey et al. 2014), the distribution patterns of Antarctic species may shift southwards and increasingly overlap, possibly eroding the extensive endemism among many Antarctic species (Nielsen & Wall 2013), particularly when considering human-mediated dispersal. Additionally, non-indigenous species may outcompete local endemics in an increasingly accommodating environment, particularly in the sub-Antarctic (Frenot et al. 2005, Hughes & Convey 2010, Hughes et al. 2010).

Current Antarctic biology is primarily influenced by the desire to conserve the unique and still largely uncharacterized biodiversity of the continent and surrounding islands. Elucidating distribution patterns of terrestrial communities and identifying biotic elements most vulnerable to climate change have been deemed some of the most important goals of Antarctic biological conservation (Kennicutt et al. 2014). Definition and extension of protected areas in Continental Antarctica, particularly in remote locations, is urgently required (Terauds et al. 2012, Shaw et al. 2014), coupled with increased monitoring of these areas for the introduction of taxa from the sub- and Maritime Antarctic (Chown et al. 2012a, Shaw et al. 2014, McGeoch et al. 2015). Efforts to capture heterogeneous patterns in terrestrial biodiversity, and to assess the future impact of alien species, require densely spaced biological and environmental survey data (Shaw et al. 2014, McGeoch et al. 2015). HTS-supported molecular methods are particularly powerful in distinguishing Antarctic endemics from non-indigenous species that are not easily detected or are difficult to identify (Hughes & Convey 2012, Chown et al. 2015b); such methods could inform, for example, on the number of eukaryotic alien and invasive species per biogeographical region in standardized frameworks (McGeoch et al. 2015).

Continent-wide survey data and time series monitoring

HTS-based metabarcoding approaches are regarded as more efficient than morphological methods for assessing the ecological integrity and health of diverse marine and terrestrial environments, because they provide a uniform, swift and economical means of species identification (Aylagas et al. 2014, Drummond et al. 2015). Potential applications to Antarctic environments are now being realized, where HTS-based metabarcoding studies similarly offer a simple, cost-efficient workflow and rich sequence information that can be easily combined or re-analysed in more detailed integrative studies (Gutt et al. 2012, Chown et al. 2015b).

In order to use the full potential of HTS for Antarctic biodiversity research and ecology, we suggest i) designing studies with close consideration of research goals defined by the international community (Kennicutt et al. 2015), ii) designing studies with larger numbers of samples, for example, across variable spatial and temporal scales, similar to approaches used by Dornelas et al. (2014) and Howard-Williams et al. (2006) or contributing towards such efforts, iii) using a variety of DNA sources for analysis, including historical material from museum collections or historical Antarctic voyages (Headland 2009), iv) providing well-documented analysis code with all published HTS data, v) further developing laboratory and analysis protocols for metabarcoding and metagenomic approaches suitable to investigate Antarctic habitats, and finally, vi) generating reference DNA sequences for Antarctic species identification using α taxonomic (including morphological) approaches (Turrill 1938).

Summary and conclusions

Metabarcoding analysis of mixed template and environmental DNA is a valuable option to describe the composition and distribution of the cryptic and heterogeneously distributed terrestrial biota of Antarctica. Metabarcoding and metagenomic approaches have proven helpful in describing bacterial and hypolithic communities in ice-free regions of Antarctica and could similarly be applied to many other taxa on the continent, including communities inhabiting snow and ice, as well as lake and marine sediments. In comparison to traditional molecular methods, HTS-based approaches yield large amounts of detailed data with relatively simple and time-efficient laboratory workflows, coupled with straightforward fieldwork. Multiple laboratory developments have recently improved the cost efficiency of PCR-based library generation allowing parallel processing of large sample numbers. Drawbacks of amplicon library generation can be alleviated by alternative library preparation methods. By providing a consistent and efficient means of species identification, as well as insights into the functional diversity of such habitats, HTS-based metabarcoding and metagenomic studies will be a useful tool for assessing the ecological integrity and health of Antarctic habitats. When applied to large sample numbers, across large spatial scales and multiple biota, HTS-based metabarcoding and metagenomic approaches will improve our understanding of Antarctic terrestrial biodiversity on a continental scale.

Acknowledgements

The authors declare no competing interests. The Australian Antarctic Division provided funding under science project 2355 to M.S. The Australian Research Council supported this work through funds from linkage grant LP0991985 to A.C. and M.S. The University of Adelaide supported this project through an International Post-Graduate Research Scholarship to P.C. We thank two anonymous reviewers for helpful comments that improved the manuscript.

Author contribution

P.C. prepared, edited and revised the manuscript. M.S., L.C. and A.C. edited and revised the manuscript.

References

Ahn, J.H., Kim, B.Y., Song, J. & Weon, H.Y. 2012. Effects of PCR cycle number and DNA polymerase type on the 16S rRNA gene pyrosequencing analysis of bacterial communities. Journal of Microbiology, 50, 1071–1074.
Altermann, S., Leavitt, S.D., Goward, T., Nelsen, M.P. & Lumbsch, H.T. 2014. How do you solve a problem like Letharia? A new look at cryptic species in lichen-forming fungi using Bayesian clustering and SNPs from multilocus sequence data. PLoS ONE, 9, 10.1371/journal.pone.0097556.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. 1990. Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.
Anders, S. & Huber, W. 2010. Differential expression analysis for sequence count data. Genome Biology, 11, 10.1186/gb-2010-11-10-r106.
Ascher, J., Ceccherini, M.T., Pantani, O.L., Agnelli, A., Borgogni, F., Guerri, G., Nannipieri, P. & Pietramellara, G. 2009. Sequential extraction and genetic fingerprinting of a forest soil metagenome. Applied Soil Ecology, 42, 10.1016/j.apsoil.2009.03.005.
Aylagas, E., Borja, Á. & Rodríguez-Ezpeleta, N. 2014. Environmental status assessment using DNA metabarcoding: towards a genetics based marine biotic index (gAMBI). PLoS ONE, 9, 10.1371/journal.pone.0090529.
Bellemain, E., Davey, M.L., Kauserud, H., Epp, L.S., Boessenkool, S., Coissac, E., Geml, J., Edwards, M., Willerslev, E., Gussarova, G., Taberlet, P. & Brochmann, C. 2013. Fungal palaeodiversity revealed using high-throughput metabarcoding of ancient DNA from arctic permafrost. Environmental Microbiology, 15, 1176–1189.
Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J. & Sayers, E.W. 2011. GenBank. Nucleic Acids Research, 39, 10.1093/nar/gkq1079.
Bik, H.M., Porazinska, D.L., Creer, S., Caporaso, J.G., Knight, R. & Thomas, W.K. 2012a. Sequencing our way towards understanding global eukaryotic biodiversity. Trends in Ecology & Evolution, 27, 233–243.
Bik, H.M., Sung, W., De Ley, P., Baldwin, J.G., Sharma, J., Rocha-Olivares, A. & Thomas, W.K. 2012b. Metagenetic community analysis of microbial eukaryotes illuminates biogeographic patterns in deep-sea and shallow water sediments. Molecular Ecology, 21, 1048–1059.
Binladen, J., Gilbert, M.T.P., Bollback, J.P., Panitz, F., Bendixen, C., Nielsen, R. & Willerslev, E. 2007. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing. PLoS ONE, 2, 10.1371/journal.pone.0000197.
Bintanja, R., Severijns, C., Haarsma, R. & Hazeleger, W. 2014. The future of Antarctica’s surface winds simulated by a high-resolution global climate model: 2. Drivers of 21st century changes. Journal of Geophysical Research - Atmospheres, 119, 7160–7178.
Blanchet, F.G., Legendre, P., Bergeron, J.A.C. & He, F.L. 2014. Consensus RDA across dissimilarity coefficients for canonical ordination of community composition data. Ecological Monographs, 84, 10.1890/13-0648.1.
Bockheim, J.G. 1997. Properties and classification of cold desert soils from Antarctica. Soil Science Society of America Journal, 61, 224–231.
Bohmann, K., Evans, A., Gilbert, M.T.P., Carvalho, G.R., Creer, S., Knapp, M., Yu, D.W. & de Bruyn, M. 2014. Environmental DNA for wildlife biology and biodiversity monitoring. Trends in Ecology & Evolution, 29, 358–367.
Bokulich, N.A., Subramanian, S., Faith, J.J., Gevers, D., Gordon, J.I., Knight, R., Mills, D.A. & Caporaso, J.G. 2013. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nature Methods, 10, 10.1038/NMETH.2276.
Bottos, E.M., Scarrow, J.W., Archer, S.D.J., McDonald, I.R. & Cary, S.C. 2014a. Bacterial community structures of Antarctic soils. In Cowan, D.A., ed. Antarctic terrestrial microbiology. Berlin: Springer, 9–33.
Bottos, E.M., Woo, A.C., Zawar-Reza, P., Pointing, S.B. & Cary, S.C. 2014b. Airborne bacterial populations above desert soils of the McMurdo Dry Valleys, Antarctica. Microbial Ecology, 67, 120–128.
Boyer, F., Mercier, C., Bonin, A., Le Bras, Y., Taberlet, P. & Coissac, E. 2016. OBITools: a UNIX-inspired software package for DNA metabarcoding. Molecular Ecology Resources, 16, 176–182.
Bragg, L.M., Stone, G., Butler, M.K., Hugenholtz, P. & Tyson, G.W. 2013. Shining a light on dark sequencing: characterising errors in ion torrent PGM data. PLoS Computational Biology, 9, 10.1371/journal.pcbi.1003031.
Bybee, S.M., Bracken-Grissom, H., Haynes, B.D., Hermansen, R.A., Byers, R.L., Clement, M.J., Udall, J.A., Wilcox, E.R. & Crandall, K.A. 2011. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics. Genome Biology and Evolution, 3, 1312–1323.
Caporaso, J.G., Kuczynski, J., Stombaugh, J., et al. 2010. QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7, 335–336.
Carpenter, E.J., Lin, S.J. & Capone, D.G. 2000. Bacterial activity in South Pole snow. Applied and Environmental Microbiology, 66, 10.1128/AEM.66.10.4514-4517.2000.
Caruso, T., Trokhymets, V., Bargagli, R. & Convey, P. 2013. Biotic interactions as a structuring force in soil communities: evidence from the micro-arthropods of an Antarctic moss model system. Oecologia, 172, 10.1007/s00442-012-2503-9.
CBOL Plant Working Group. 2009. A DNA barcode for land plants. Proceedings of the National Academy of Sciences of the United States of America, 106, 12 794–12 797.
Chown, S.L., Clarke, A., Fraser, C.I., Cary, S.C., Moon, K.L. & McGeoch, M.A. 2015a. The changing form of Antarctic biodiversity. Nature, 522, 10.1038/nature14505.
Chown, S.L., Hodgins, K.A., Griffin, P.C., Oakeshott, J.G., Byrne, M. & Hoffmann, A.A. 2015b. Biological invasions, climate change and genomics. Evolutionary Applications, 8, 10.1111/eva.12234.
Chown, S., Huiskes, A.H.L., Gremmen, N.J.M., Lee, J.E., Terauds, A., Crosbie, K., Frenot, Y., Hughes, K.A., Imura, S., Kiefer, K., Lebouvier, M., Raymond, B., Tsujimoto, M., Ware, C., van de Vijver, B. & Bergstrom, D.M. 2012a. Continent-wide risk assessment for the establishment of nonindigenous species in Antarctica. Proceedings of the National Academy of Sciences of the United States of America, 109, 10.1073/pnas.1119787109.
Chown, S.L., Lee, J.E., Hughes, K.A., Barnes, J., Barrett, P.J., Bergstrom, D.M., Convey, P., Cowan, D.A., Crosbie, K., Dyer, G., Frenot, Y., Grant, S.M., Herr, D., Kennicutt, M.C., Lamers, M., Murray, A., Possingham, H.P., Reid, K., Riddle, M.J., Ryan, P.G., Sanson, L., Shaw, J.D., Sparrow, M.D., Summerhayes, C., Terauds, A. & Wall, D.H. 2012b. Challenges to the future conservation of the Antarctic. Science, 337, 158–159.
Clarke, L.J., Soubrier, J., Weyrich, L.S. & Cooper, A. 2014a. Environmental metabarcodes for insects: in silico PCR reveals potential for taxonomic bias. Molecular Ecology Resources, 14, 1160–1170.
Clarke, L.J., Czechowski, P., Soubrier, J., Stevens, M.I. & Cooper, A. 2014b. Modular tagging of amplicons using a single PCR for high-throughput sequencing. Molecular Ecology Resources, 14, 117–121.
Cline, J., Braman, J.C. & Hogrefe, H.H. 1996. PCR fidelity of Pfu DNA polymerase and other thermostable DNA polymerases. Nucleic Acids Research, 24, 3546–3551.
Coissac, E., Riaz, T. & Puillandre, N. 2012. Bioinformatic challenges for DNA metabarcoding of plants and animals. Molecular Ecology, 21, 1834–1847.
Colesie, C., Gommeaux, M., Green, T.G.A. & Bueddel, B. 2014. Biological soil crusts in Continental Antarctica: Garwood Valley, southern Victoria Land, and Diamond Hill, Darwin Mountains region. Antarctic Science, 26, 115–123.
Convey, P. 1997. How are the life history strategies of Antarctic terrestrial invertebrates influenced by extreme environmental conditions? Journal of Thermal Biology, 22, 10.1016/S0306-4565(97)00062-4.
Convey, P. 2010. Terrestrial biodiversity in Antarctica – Recent advances and future challenges. Polar Science, 4, 135–147.
Convey, P. & Stevens, M.I. 2007. Antarctic biodiversity. Science, 317, 1877–1878.
Convey, P., Stevens, M.I., Hodgson, D.A., Smellie, J.L., Hillenbrand, C.D., Barnes, D.K.A., Clarke, A., Pugh, P.J.A., Linse, K. & Cary, S.C. 2009. Exploring biological constraints on the glacial history of Antarctica. Quaternary Science Reviews, 28, 3035–3048.
Convey, P., Chown, S.L., Clarke, A., Barnes, D.K.A., Bokhorst, S., Cummings, V., Ducklow, H.W., Frati, F., Green, T.G.A., Gordon, S., Griffiths, H.J., Howard-Williams, C., Huiskes, A.H.L., Laybourn-Parry, J., Lyons, W.B., McMinn, A., Morley, S.A., Peck, L.S., Quesada, A., Robinson, S.A., Schiaparelli, S. & Wall, D.H. 2014. The spatial structure of Antarctic biodiversity. Ecological Monographs, 84, 203–244.
Cowan, D.A., Ramond, J.B., Makhalanyane, T. & de Maayer, P. 2015. Metagenomics of extreme environments. Current Opinion in Microbiology, 25, 10.1016/j.mib.2015.05.005.
Czechowski, P., Clarke, L.J., Breen, J., Cooper, A. & Stevens, M.I. 2016. Antarctic eukaryotic soil diversity of the Prince Charles Mountains revealed by high-throughput sequencing. Soil Biology & Biochemistry, 95, 10.1016/j.soilbio.2015.12.013.
Dalén, L., Götherström, A., Meijer, T. & Shapiro, B. 2007. Recovery of DNA from footprints in the snow. Canadian Field-Naturalist, 121, 321–324.
De Cárcer, D.A., Denman, S.E., McSweeney, C. & Morrison, M. 2011. Strategy for modular tagged high-throughput amplicon sequencing. Applied and Environmental Microbiology, 77, 6310–6312.
Deagle, B.E., Jarman, S.N., Coissac, E., Pompanon, F. & Taberlet, P. 2014. DNA metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect match. Biology Letters, 10, 10.1098/rsbl.2014.0562.
Delmont, T.O., Simonet, P. & Vogel, T.M. 2013. Mastering methodological pitfalls for surviving the metagenomic jungle. BioEssays, 35, 744–754.
Delmont, T.O., Robe, P., Cecillon, S., Clark, I.M., Constancias, F., Simonet, P., Hirsch, P.R. & Vogel, T.M. 2011. Accessing the soil metagenome for studies of microbial diversity. Applied and Environmental Microbiology, 77, 10.1128/AEM.01526-10.
Denonfoux, J., Parisot, N., Dugat-Bony, E., Biderre-Petit, C., Boucher, D., Morgavi, D.P., Le Paslier, D., Peyretaillade, E. & Peyret, P. 2013. Gene capture coupled to high-throughput sequencing as a strategy for targeted metagenome exploration. DNA Research, 20, 185–196.
Dornelas, M., Gotelli, N.J., McGill, B., Shimadzu, H., Moyes, F., Sievers, C. & Magurran, A.E. 2014. Assemblage time series reveal biodiversity change but not systematic loss. Science, 344, 10.1126/science.1248484.
Dreesens, L.L., Lee, C.K. & Cary, S.C. 2014. The distribution and identity of edaphic fungi in the McMurdo Dry Valleys. Biology, 3, 10.3390/biology3030466.
Drummond, A.J., Newcomb, R.D., Buckley, T.R., Xie, D., Dopheide, A., Potter, B.C.M., Heled, J., Ross, H.A., Tooman, L., Grosser, S., Park, D., Demetras, N.J., Stevens, M.I., Russell, J.C., Anderson, S.H., Carter, A. & Nelson, N. 2015. Evaluating a multigene environmental DNA approach for biodiversity assessment. GigaScience, 4, 10.1186/s13742-015-0086-1.
Edgar, R.C. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26, 2460–2461.
Edgar, R.C. 2013. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods, 10, 996–998.
Ellis, N., Smith, S.J. & Pitcher, C.R. 2012. Gradient forests: calculating importance gradients on physical predictors. Ecology, 93, 156–168.
Epp, L.S., Boessenkool, S., Bellemain, E.P., Haile, J., Esposito, A., Riaz, T., Erseus, C., Gusarov, V.I., Edwards, M.E., Johnsen, A., Stenoien, H.K., Hassel, K., Kauserud, H., Yoccoz, N.G., Brathen, K., Willerslev, E., Taberlet, P., Coissac, E. & Brochmann, C. 2012. New environmental metabarcodes for analysing soil DNA: potential for studying past and present ecosystems. Molecular Ecology, 21, 1821–1833.
Ettema, C.H. & Wardle, D.A. 2002. Spatial soil ecology. Trends in Ecology & Evolution, 17, 177–183.
Faircloth, B.C. & Glenn, T.C. 2012. Not all sequence tags are created equal: designing and validating sequence identification tags robust to indels. PLoS ONE, 7, 10.1371/journal.pone.0042543.
Faircloth, B.C., McCormack, J.E., Crawford, N.G., Harvey, M.G., Brumfield, R.T. & Glenn, T.C. 2012. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Systematic Biology, 61, 10.1093/sysbio/sys004.
Fell, J.W., Scorzetti, G., Connell, L. & Craig, S. 2006. Biodiversity of micro-eukaryotes in Antarctic Dry Valley soils with <5% soil moisture. Soil Biology & Biochemistry, 38, 3107–3119.
Feng, Y.-J., Liu, Q.-F., Chen, M.-Y., Liang, D. & Zhang, P. 2015. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly. Molecular Ecology Resources, 16, 10.1111/1755-0998.12429.
Fernández-Mendoza, F., Domaschke, S., García, M.A., Jordan, P., Martín, M.P. & Printzen, C. 2011. Population structure of mycobionts and photobionts of the widespread lichen Cetraria aculeata. Molecular Ecology, 20, 1208–1232.
Ficetola, G.F., Pansu, J., Bonin, A., Coissac, E., Giguet-Covex, C., De Barba, M., Gielly, L., Lopes, C.M., Boyer, F., Pompanon, F., Raye, G. & Taberlet, P. 2015. Replication levels, false presences and the estimation of the presence/absence from eDNA metabarcoding data. Molecular Ecology Resources, 15, 543–556.
Fierer, N., Leff, J.W., Adams, B.J., Nielsen, U.N., Bates, S.T., Lauber, C.L., Owens, S., Gilbert, J.A., Wall, D.H. & Caporaso, J.G. 2012. Cross-biome metagenomic analyses of soil microbial communities and their functional attributes. Proceedings of the National Academy of Sciences of the United States of America, 109, 10.1073/pnas.1215210110.
Frenot, Y., Chown, S.L., Whinam, J., Selkirk, P.M., Convey, P., Skotnicki, M. & Bergstrom, D.M. 2005. Biological invasions in the Antarctic: extent, impacts and implications. Biological Reviews, 80, 45–72.
Gagné, S.A. & Proulx, R. 2009. Accurate delineation of biogeographical regions depends on the use of an appropriate distance measure. Journal of Biogeography, 36, 561–562.
Giardine, B., Riemer, C., Hardison, R.C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I., Taylor, J., Miller, W., Kent, W.J. & Nekrutenko, A. 2005. Galaxy: a platform for interactive large-scale genome analysis. Genome Research, 15, 14511455.Google Scholar
Gilbert, J.A, Meyer, F., Antonopoulos, D., Balaji, P., Brown, C.T., Desai, N., Eisen, J.A., Evers, D., Field, D., Feng, W., Huson, D., Jansson, J., Knight, R., Knight, J., Kolker, E., Konstantindis, K., Kostka, J., Kyrpides, N., Mackelprang, R., McHardy, A., Quince, C., Raes, J., Sczyrba, A., Shade, A. & Stevens, R. 2010. Meeting report: the terabase metagenomics workshop and the vision of an Earth microbiome project. Standards in Genomic Sciences, 3, 243248.Google Scholar
Glenn, T.C. 2011. Field guide to next-generation DNA sequencers. Molecular Ecology Resources, 11, 10.1111/j.1755-0998.2011.03024.x.Google Scholar
Gnirke, A., Melnikov, A., Maguire, J., Rogov, P., LeProust, E.M., Brockman, W., Fennell, T., Giannoukos, G., Fisher, S., Russ, C., Gabriel, S., Jaffe, D.B., Lander, E.S. & Nusbaum, C. 2009. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nature Biotechnology, 27, 182189.CrossRefGoogle ScholarPubMed
Gokul, J.K., Valverde, A., Tuffin, M., Cary, S.C. & Cowan, D.A. 2013. Micro-eukaryotic diversity in hypolithons from Miers Valley, Antarctica. Biology, 2, 331340.Google Scholar
Goordial, J., Davila, A., Greer, C.W., Cannam, R., DiRuggiero, J., McKay, C.P. & Whyte, L.G. 2016. Comparative activity and functional ecology of permafrost soils and lithic niches in a hyper-arid polar desert. Environmental Microbiology, 10.1111/1462-2920.13353.Google Scholar
Gutt, J., Zurell, D., Bracegridle, T.J., Cheung, W., Clark, M.S., Convey, P., Danis, B., David, B., De Broyer, C., di Prisco, G., Griffiths, H., Laffont, R., Peck, L.S., Pierrat, B., Riddle, M.J., Saucede, T., Turner, J., Verde, C., Wang, Z.M. & Grimm, V. 2012. Correlative and dynamic species distribution modelling for ecological predictions in the Antarctic: a cross-disciplinary concept. Polar Research, 31, 10.3402/polar.v31i0.11091.Google Scholar
Hajibabaei, M., Spall, J.L., Shokralla, S. & van Konynenburg, S. 2012. Assessing biodiversity of a freshwater benthic macroinvertebrate community through non-destructive environmental barcoding of DNA from preservative ethanol. BMC Ecology, 12, 10.1186/1472-6785-12-28.Google Scholar
Hamming, R.W. 1950. Error detecting and error correcting codes. Bell System Technical Journal, 29, 10.1002/j.1538-7305.1950.tb00463.x.Google Scholar
Headland, R.K. 2009. A chronology of Antarctic exploration, 2nd ed. Bernard Quaritch, 722 pp.Google Scholar
Hogg, I.D., Cary, S.C., Convey, P., Newsham, K.K., O’Donnell, A.G., Adams, B.J., Aislabie, J., Frati, F., Stevens, M.I. & Wall, D.H. 2006. Biotic interactions in Antarctic terrestrial ecosystems: are they a factor? Soil Biology & Biochemistry, 38, 30353040.Google Scholar
Howard-Williams, C., Peterson, D., Lyons, W.B., Cattaneo-Vietti, R. & Gordon, S. 2006. Measuring ecosystem response in a rapidly changing environment: the Latitudinal Gradient Project. Antarctic Science, 18, 465471.Google Scholar
Hughes, K.A. & Convey, P. 2010. The protection of Antarctic terrestrial ecosystems from inter- and intra-continental transfer of non-indigenous species by human activities: a review of current systems and practices. Global Environmental Change - Human and Policy Dimensions, 20, 10.1016/j.gloenvcha.2009.09.005.Google Scholar
Hughes, K.A. & Convey, P. 2012. Determining the native/non-native status of newly discovered terrestrial and freshwater species in Antarctica – current knowledge, methodology and management action. Journal of Environmental Management, 93, 5266.Google Scholar
Hughes, K.A., Convey, P., Maslen, N.R. & Smith, R.I.L. 2010. Accidental transfer of non-native soil organisms into Antarctica on construction vehicles. Biological Invasions, 12, 875891.Google Scholar
Hui, F.K.C., Taskinen, S., Pledger, S., Foster, S.D. & Warton, D.I. 2015. Model-based approaches to unconstrained ordination. Methods in Ecology and Evolution, 6, 10.1111/2041-210X.12236.Google Scholar
Huson, D.H. & Weber, N. 2013. Microbial community analysis using MEGAN. Microbial Metagenomics, Metatranscriptomics, and Metaproteomics, 531, 10.1016/B978-0-12-407863-5.00021-6.Google Scholar
Jarman, S.N., McInnes, J.C., Faux, C., Polanowski, A.M., Marthick, J., Deagle, B.E., Southwell, C. & Emmerson, L. 2013. Adélie penguin population diet monitoring by analysis of food DNA in scats. PLoS ONE, 8, 10.1371/journal.pone.0082227.CrossRefGoogle ScholarPubMed
Jungblut, A.D., Vincent, W.F. & Lovejoy, C. 2012. Eukaryotes in Arctic and Antarctic cyanobacterial mats. FEMS Microbiology Ecology, 82, 416428.Google Scholar
Kanagawa, T. 2003. Bias and artifacts in multitemplate polymerase chain reactions (PCR). Journal of Bioscience and Bioengineering, 96, 317323.Google Scholar
Kennicutt, M.C., Chown, S.L., Cassano, J.J., Liggett, D., Massom, R., Peck, L.S., Rintoul, S.R., Storey, J.W.V., Vaughan, D.G., Wilson, T.J. & Sutherland, W.J. 2014. Polar research: six priorities for Antarctic science. Nature, 512, 10.1038/512023a.Google Scholar
Kennicutt, M.C., Chown, S.L., Cassano, J.J. et al. 2015. A roadmap for Antarctic and Southern Ocean science for the next two decades and beyond. Antarctic Science, 27, 10.1017/S0954102014000674.Google Scholar
Khan, N., Tuffin, M., Stafford, W., Cary, C., Lacap, D.C., Pointing, S.B. & Cowan, D. 2011. Hypolithic microbial communities of quartz rocks from Miers Valley, McMurdo Dry Valleys, Antarctica. Polar Biology, 34, 16571668.Google Scholar
Kircher, M., Sawyer, S. & Meyer, M. 2012. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Research, 40, 10.1093/nar/gkr771.Google Scholar
Kwok, S. 1990. Procedures to minimize PCR-product carry-over. In Innis, M.A., Gelfand, D.H. & Sninsky, J.J., eds. PCR protocols: a guide to methods and applications. San Diego, CA: Academic Press, 142145.Google Scholar
Lan, Y., Wang, Q., Cole, J.R. & Rosen, G.L. 2012. Using the RDP classifier to predict taxonomic novelty and reduce the search space for finding novel organisms. PLoS ONE, 7, 10.1371/journal.pone.0032491.Google Scholar
Lawley, B., Ripley, S., Bridge, P. & Convey, P. 2004. Molecular analysis of geographic patterns of eukaryotic diversity in Antarctic soils. Applied and Environmental Microbiology, 70, 59635972.Google Scholar
Lee, C.K., Barbier, B.A., Bottos, E.M., McDonald, I.R. & Cary, S.C. 2012a. The Inter-Valley Soil Comparative Survey: the ecology of Dry Valley edaphic microbial communities. ISME Journal, 6, 10.1038/ismej.2011.170.Google Scholar
Lee, C.K., Herbold, C.W., Polson, S.W., Wommack, K.E., Williamson, S.J., McDonald, I.R. & Cary, S.C. 2012b. Groundtruthing next-gen sequencing for microbial ecology – biases and errors in community structure estimates from PCR amplicon pyrosequencing. PLoS ONE, 7, 10.1371/journal.pone.0044224.Google Scholar
Legendre, P. & Andersson, M.J. 1999. Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. Ecological Monographs, 69, 10.1890/0012-9615(1999)069[0001:DBRATM]2.0.CO;2.Google Scholar
Lemmon, A.R., Emme, S.A. & Lemmon, E.M. 2012. Anchored hybrid enrichment for massively high-throughput phylogenomics. Systematic Biology, 61, 10.1093/sysbio/sys049.Google Scholar
Lenz, T.L. & Becker, S. 2008. Simple approach to reduce PCR artefact formation leads to reliable genotyping of MHC and other highly polymorphic loci – implications for evolutionary analysis. Gene, 427, 117123.Google Scholar
Leray, M., Yang, J.Y., Meyer, C.P., Mills, S.C., Agudelo, N., Ranwez, V., Boehm, J.T. & Machida, R.J. 2013. A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: application for characterizing coral reef fish gut contents. Frontiers in Zoology, 10, 10.1186/1742-9994-10-34.Google Scholar
Lindgreen, S. 2012. AdapterRemoval: easy cleaning of next generation sequencing reads. BMC Research Notes, 5, 10.1186/1756-0500-5-337.Google Scholar
Lindgreen, S., Adair, K.L. & Gardner, P.P. 2016. An evaluation of the accuracy and speed of metagenome analysis tools. Scientific Reports, 6, 19233, 10.1038/srep19233.Google Scholar
Liu, S., Li, Y.Y., Lu, J.L., Su, X., Tang, M., Zhang, R., Zhou, L.L., Zhou, C.R., Yang, Q., Ji, Y.Q., Yu, D.W. & Zhou, X. 2013. SOAPBarcode: revealing arthropod biodiversity through assembly of Illumina shotgun sequences of PCR amplicons. Methods in Ecology and Evolution, 4, 11421150.CrossRefGoogle Scholar
Lohse, M., Bolger, A.M., Nagel, A., Fernie, A.R., Lunn, J.E., Stitt, M. & Usadel, B. 2012. RobiNA: A user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Research, 40, W622W627.CrossRefGoogle Scholar
López-Bueno, A., Tamames, J., Velázquez, D., Moya, A., Quesada, A. & Alcamí, A. 2009. High diversity of the viral community from an Antarctic lake. Science, 326, 858861.Google Scholar
Machida, R.J. & Knowlton, N. 2012. PCR primers for metazoan nuclear 18S and 28S ribosomal DNA sequences. PLoS ONE, 7, 10.1371/journal.pone.0046180.Google Scholar
Magalhaes, C., Stevens, M.I., Cary, S.C., Ball, B.A., Storey, B.C., Wall, D.H., Tuerk, R. & Ruprecht, U. 2012. At limits of life: multidisciplinary insights reveal environmental constraints on biotic diversity in Continental Antarctica. PLoS ONE, 7, 10.1371/journal.pone.0044578.Google Scholar
Makhalanyane, T.P., Valverde, A., Birkeland, N.K., Cary, S.C., Tuffin, I.M. & Cowan, D.A. 2013. Evidence for successional development in Antarctic hypolithic bacterial communities. ISME Journal, 7, 20802090.Google Scholar
Marchant, D.R. & Head, J.W. 2007. Antarctic dry valleys: microclimate zonation, variable geomorphic processes, and implications for assessing climate change on Mars. Icarus, 192, 187222.CrossRefGoogle Scholar
McGaughran, A., Stevens, M.I., Hogg, I.D. & Carapelli, A. 2011. Extreme glacial legacies: a synthesis of the Antarctic springtail phylogeographic record. Insects, 2, 6282.Google Scholar
McGeoch, M.A., Shaw, J.D., Terauds, A., Lee, J.E. & Chown, S.L. 2015. Monitoring biological invasion across the broader Antarctic: a baseline and indicator framework. Global Environmental Change - Human and Policy Dimensions, 32, 108125.Google Scholar
McMurdie, P.J. & Holmes, S. 2013. Phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE, 8, 10.1371/journal.pone.0061217.CrossRefGoogle Scholar
McMurdie, P.J. & Holmes, S. 2014. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS Computational Biology, 10, 10.1371/journal.pcbi.1003531.Google Scholar
Meyer, M., Stenzel, U. & Hofreiter, M. 2008. Parallel tagged sequencing on the 454 platform. Nature Protocols, 3, 267278.Google Scholar
Nakai, R., Abe, T., Baba, T., Imura, S., Kagoshima, H., Kanda, H., Kohara, Y., Koi, A., Niki, H., Yanagihara, K. & Naganuma, T. 2012. Eukaryotic phylotypes in aquatic moss pillars inhabiting a freshwater lake in East Antarctica, based on 18S rRNA gene analysis. Polar Biology, 35, 14951504.Google Scholar
Niederberger, T.D., Sohm, J.A., Gunderson, T.E., Parker, A.E., Tirindelli, J., Capone, D.G., Carpenter, E.J. & Cary, S.C. 2015. Microbial community composition of transiently wetted Antarctic Dry Valley soils. Frontiers in Microbiology, 6, 10.3389/fmicb.2015.00009.Google Scholar
Nielsen, U.N. & Wall, D.H. 2013. The future of soil invertebrate communities in polar regions: different climate change responses in the Arctic and Antarctic? Ecology Letters, 16, 409419.Google Scholar
O’Neill, E.M., Schwartz, R., Bullock, C.T., Williams, J.S., Shaffer, H.B., Aguilar-Miguel, X., Parra-Olea, G. & Weisrock, D.W. 2013. Parallel tagged amplicon sequencing reveals major lineages and phylogenetic structure in the North American tiger salamander (Ambystoma tigrinum) species complex. Molecular Ecology, 22, 111129.Google Scholar
Oksanen, J., Blanchet, F.G., Kindt, R., Legendre, P., Minchin, P.R., O’Hara, R.B., Simpson, G.L., Solymos, P., Stevens, M.H.H. & Wagner, H. 2015. . Vegan: community ecology package. Available at: https://cran.r-project.org/package=vegan.Google Scholar
Orgiazzi, A., Bianciotto, V., Bonfante, P., Daghino, S., Ghignone, S., Lazzari, A., Lumini, E., Mello, A., Napoli, C., Perotto, S., Vizzini, A., Bagella, A., Murat, C. & Girlanda, M. 2013. 454 pyrosequencing analysis of fungal assemblages from geographically distant, disparate soils reveals spatial patterning and a core mycobiome. Diversity, 5, 7398.Google Scholar
Pedersen, M.W., Overballe-Petersen, S., Ermini, L., Sarkissian, C.D., Haile, J., Hellstrom, M., Spens, J., Thomsen, P.F., Bohmann, K., Cappellini, E., Schnell, I.B., Wales, N.A., Caroe, C., Campos, P.F., Schmidt, A.M.Z., Gilbert, M.T.P., Hansen, A.J., Orlando, L. & Willerslev, E. 2014. Ancient and modern environmental DNA. Philosophical Transactions of the Royal Society - Biological Sciences, B370, 10.1098/rstb.2013.0383.Google Scholar
Pointing, S.B. & Belnap, J. 2012. Microbial colonization and controls in dryland systems. Nature Reviews Microbiology, 10, 551562.Google Scholar
Powell, S.M., Bowman, J.P., Snape, I. & Stark, J.S. 2003. Microbial community variation in pristine and polluted nearshore Antarctic sediments. FEMS Microbiology Ecology, 45, 10.1016/S0168-6496(03)00135-1.Google Scholar
Pruesse, E., Quast, C., Knittel, K., Fuchs, B.M., Ludwig, W.G., Peplies, J. & Gloeckner, F.O. 2007. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Research, 35, 71887196.Google Scholar
R Development Team . 2016. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Available at: http://www.r-project.org/.Google Scholar
Ratnasingham, S. & Hebert, P.D.N. 2007. BOLD: the barcode of life data system (www.barcodinglife.org). Molecular Ecology Notes, 7, 355364.Google Scholar
Riaz, T., Shehzad, W., Viari, A., Pompanon, F., Taberlet, P. & Coissac, E. 2011. ecoPrimers: inference of new DNA barcode markers from whole genome sequence analysis. Nucleic Acids Research, 39, 10.1093/nar/gkr732.Google Scholar
Robertson, C.E., Harris, J.K., Wagner, B.D., Granger, D., Browne, K., Tatem, B., Feazel, L.M., Park, K., Pace, N.R. & Frank, D.N. 2013. Explicet: graphical user interface software for metadata-driven management, analysis and visualization of microbiome data. Bioinformatics, 29, 31003101.Google Scholar
Robinson, M.D., McCarthy, D.J. & Smyth, G.K. 2009. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26, 139140.Google Scholar
Roesch, L.F.W., Fulthorpe, R.R., Pereira, A.B., Pereira, C.K., Lemos, L.N., Barbosa, A.D., Suleiman, A.K.A., Gerber, A.L., Pereira, M.G., Loss, A. & da Costa, E.M. 2012. Soil bacterial community abundance and diversity in ice-free areas of Keller Peninsula, Antarctica. Applied Soil Ecology, 61, 715.Google Scholar
Rogers, A.D. 2007. Evolution and biodiversity of Antarctic organisms: a molecular perspective. Philosophical transactions of the Royal Society - Biological sciences, B362, 21912214.Google Scholar
Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B. & Erlich, H.A. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA-polymerase. Science, 239, 487491.Google Scholar
Salter, S.J., Cox, M.J., Turek, E.M., Calus, S.T., Cookson, W.O., Moffatt, M.F., Turner, P., Parkhill, J., Loman, N.J. & Walker, A.W. 2014. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biology, 12, 10.1186/s12915-014-0087-z.Google Scholar
Schloss, P.D., Westcott, S.L., Ryabin, T., Hall, J.R., Hartmann, M., Hollister, E.B., Lesniewski, R.A, Oakley, B.B., Parks, D.H., Robinson, C.J., Sahl, J.W., Stres, B., Thallinger, G.G., van Horn, D.J. & Weber, C.F. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75, 75377541.Google Scholar
Shaw, J.D., Terauds, A., Riddle, M.J., Possingham, H.P. & Chown, S.L. 2014. Antarctica’s protected areas are inadequate, unrepresentative, and at risk. PLoS Biology, 12, 10.1371/journal.pbio.1001888.Google Scholar
Shokralla, S., Singer, G.A.C. & Hajibabaei, M. 2010. Direct PCR amplification and sequencing of specimens’ DNA from preservative ethanol. BioTechniques, 48, 10.2144/000113362.Google Scholar
Sipos, R., Székely, A.J., Palatinszky, M., Révész, S., Márialigeti, K. & Nikolausz, M. 2007. Effect of primer mismatch, annealing temperature and PCR cycle number on 16S rRNA gene-targeting bacterial community analysis. FEMS Microbiology Ecology, 60, 10.1111/j.1574-6941.2007.00283.x.Google Scholar
Smith, D.P. & Peay, K.G. 2014. Sequence depth, not PCR replication, improves ecological inference from next generation DNA sequencing. PLoS ONE, 9, 10.1371/journal.pone.0090234.Google Scholar
Stiller, M., Knapp, M., Stenzel, U., Hofreiter, M. & Meyer, M. 2009. Direct multiplex sequencing (DMPS) – a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA. Genome Research, 19, 18431848.Google Scholar
Taberlet, P., Coissac, E., Pompanon, F., Brochmann, C. & Willerslev, E. 2012a. Towards next-generation biodiversity assessment using DNA metabarcoding. Molecular Ecology, 21, 20452050.Google Scholar
Taberlet, P., Prud’Homme, S.M., Campione, E., Roy, J., Miquel, C., Shehzad, W., Gielly, L., Rioux, D., Choler, P., Clement, J.C., Melodelima, C., Pompanon, F. & Coissac, E. 2012b. Soil sampling and isolation of extracellular DNA from large amount of starting material suitable for metabarcoding studies. Molecular Ecology, 21, 18161820.Google Scholar
Tang, C.Q., Leasi, F., Obertegger, U., Kieneke, A., Barraclough, T.G. & Fontaneto, D. 2012. The widely used small subunit 18S rDNA molecule greatly underestimates true diversity in biodiversity surveys of the meiofauna. Proceedings of the National Academy of Sciences of the United States of America, 109, 16 20816 212.Google Scholar
Terauds, A., Chown, S.L., Morgan, F., Peat, H.J., Watts, D.J., Keys, H., Convey, P. & Bergstrom, D.M. 2012. Conservation biogeography of the Antarctic. Diversity and Distributions, 18, 726741.Google Scholar
Turrill, W.B. 1938. The expansion of taxonomy with special reference to spermatophyta. Biological Reviews, 13, 10.1111/j.1469-185X.1938.tb00522.x.Google Scholar
Van Dijk, E.L., Auger, H., Jaszczyszyn, Y. & Thermes, C. 2014. Ten years of next-generation sequencing technology. Trends in Genetics, 30, 10.1016/j.tig.2014.07.001.Google Scholar
Van Rossum, G. & Drake, F.L. 1995. Python tutorial. Amsterdam: Centrum voor Wiskunde en Informatica.Google Scholar
Velasco-Castrillón, A. & Stevens, M.I. 2014. Morphological and molecular diversity at a regional scale: a step closer to understanding Antarctic nematode biogeography. Soil Biology & Biochemistry, 70, 272284.Google Scholar
Velasco-Castrillón, A., Gibson, J.A.E. & Stevens, M.I. 2014a. A review of current Antarctic limno-terrestrial microfauna. Polar Biology, 37, 15171531.CrossRefGoogle Scholar
Velasco-Castrillón, A., Page, T.J., Gibson, J.A.E. & Stevens, M.I. 2014b. Surprisingly high levels of biodiversity and endemism amongst Antarctic rotifers uncovered with mitochondrial DNA. Biodiversity, 15, 130142.Google Scholar
Velasco-Castrillón, A., Schultz, M.B., Colombo, F., Gibson, J.A.E., Davies, K.A, Austin, A.D. & Stevens, M.I. 2014c. Distribution and diversity of soil microfauna from East Antarctica: assessing the link between biotic and abiotic factors. PLoS ONE, 9, 10.1371/journal.pone.0087529.Google Scholar
Wang, Y., Naumann, U., Wright, S.T. & Warton, D.I. 2012. mvabund – an R package for model-based analysis of multivariate abundance data. Methods in Ecology and Evolution, 3, 10.1111/j.2041-210X.2012.00190.x.CrossRefGoogle Scholar
Whittaker, R.H. 1960. Vegetation of the Siskiyou Mountains, Oregon and California. Ecological Monographs, 30, 10.2307/1943563.Google Scholar
Wilke, A., Bischof, J., Gerlach, W., Glass, E., Harrison, T., Keegan, K.P., Paczian, T., Trimble, W.L., Bagchi, S., Gram, A., Chaterji, S. & Meyer, F. 2016. The MG-RAST metagenomics database and portal in 2015. Nucleic Acids Research, 44, 10.1093/nar/gkv1322.Google Scholar
Willerslev, E., Hansen, A.J. & Poinar, H.N. 2004. Isolation of nucleic acids and cultures from fossil ice and permafrost. Trends in Ecology & Evolution, 19, 10.1016/j.tree.2003.11.010.Google Scholar
Wish, M. & Carroll, J.D. 1982. Multidimensional scaling and its applications. In Krishnaiah, P.R. & Kanal, L.N., eds. Handbook of statistics 2. North-Holland: Elsevier, 317345.Google Scholar
Wu, T.H., Ayres, E., Bardgett, R.D., Wall, D.H. & Garey, J.R. 2011. Molecular study of worldwide distribution and diversity of soil animals. Proceedings of the National Academy of Sciences of the United States of America, 108, 17 72017 725.Google Scholar
Zhan, A.B., Bailey, S.A., Heath, D.D. & Macisaac, H.J. 2014. Performance comparison of genetic markers for high-throughput sequencing-based biodiversity assessment in complex communities. Molecular Ecology Resources, 14, 10.1111/1755-0998.12254.Google Scholar
Zhou, X.F. & Rokas, A. 2014. Prevention, diagnosis and treatment of high-throughput sequencing data pathologies. Molecular Ecology, 23, 10.1111/mec.12680.Google Scholar

Fig. 1 Workflow for metabarcoding analyses, which can be applied to soil, snow, ice, cryoconite holes, lake sediments or nearshore marine environments. a. Samples are collected. b. The genetic material is extracted in bulk from individual samples. c. DNA contained in extracts is amplified with genetic markers, and sequencing adapters and multiplex identifier (MID) tags are added. d. The library is processed on a high-throughput sequencing device. e. After data deconvolution according to sample, reference information is used to assign taxonomic information to individual sequences or sequence clusters. f. Distributional information becomes available. Picture of sequencing device provided courtesy of Illumina (San Diego, CA, USA). Base layers courtesy of the Scientific Committee on Antarctic Research Antarctic Digital Database.
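
The deconvolution in step e can be illustrated with a minimal sketch, assuming each read begins with a fixed-length MID tag and that a small number of tag mismatches may be tolerated. The function names, tag sequences and sample names below are hypothetical and chosen only for illustration. Established demultiplexing tools additionally handle quality scores, dual indexing and primer trimming.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, tag_to_sample, max_mismatches=0):
    """Assign reads to samples by the MID tag at the start of each read.

    Assumes all tags share one length (an illustrative simplification).
    The tag is trimmed from assigned reads; reads that match no tag, or
    more than one tag, are kept untrimmed under the key "unassigned".
    """
    tag_lengths = {len(tag) for tag in tag_to_sample}
    if len(tag_lengths) != 1:
        raise ValueError("all MID tags must have the same length")
    tag_len = tag_lengths.pop()

    by_sample = defaultdict(list)
    for read in reads:
        prefix, insert = read[:tag_len], read[tag_len:]
        hits = [
            sample
            for tag, sample in tag_to_sample.items()
            if len(prefix) == tag_len and hamming(prefix, tag) <= max_mismatches
        ]
        if len(hits) == 1:
            by_sample[hits[0]].append(insert)
        else:
            by_sample["unassigned"].append(read)
    return dict(by_sample)


# Hypothetical tags and reads, for illustration only.
tags = {"ACGT": "soil_A", "TGCA": "soil_B"}
reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGAGGGCCC", "GGGGTTTTAA"]

exact = demultiplex(reads, tags)
tolerant = demultiplex(reads, tags, max_mismatches=1)
```

With exact matching, the third read (one mismatch in its tag) is left unassigned. Allowing one mismatch recovers it for the first sample, which is why tag sets are usually designed so that any two tags differ at several positions.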


Table I Selection of analysis software for metabarcoding data of environmental DNA. Possible tasks related to the handling of metabarcoding data are provided in columns. Multi-purpose tools in the top section are suitable for various sequence analysis tasks. Software environments in the middle section are specialized for metabarcoding analysis. Programs and packages in the bottom section are focussed on ecological analysis and/or visualizing results. X = functionality of a given tool for a given task is provided, (X) = functionality is provided to some extent, - = functionality is not provided.