Ready-to-use public infrastructure for global SARS-CoV-2 monitoring

To the Editor — The COVID-19 pandemic is the first health crisis characterized by large amounts of genomic data¹. Computational infrastructure can be a bottleneck for data analysis, amplifying global inequalities in ability to track SARS-CoV-2 evolution. This is an issue even in developed countries, as computational infrastructure requires expertise in resource procurement, configuration and maintenance. Commercial computational clouds do not fully address the problem because these resources must still be configured and funded. Furthermore, commercial clouds are predominantly US-based and many countries have policies making payments to foreign providers impractical. In developing countries, research computing infrastructure is rare and researchers often cannot afford commercial cloud-based computation. Here, we present the COVID-19 effort by the Galaxy Project, which pools free worldwide public computational infrastructure, making the analysis of deep sequencing data accessible to anyone while also providing an analytical framework for global pathogen genomic surveillance based on raw sequencing-read data.

Despite the existence of well designed and validated SARS-CoV-2 data analysis approaches²,³, the ad hoc⁴ nature of their application often complicates the integration and comparison of analysis results. Public computational infrastructure (XSEDE, ELIXIR and Nectar Cloud in the United States, European Union and Australia, respectively) coupled with existing open-source software offers a solution to SARS-CoV-2 analytics challenges. However, glue is required to bind these resources into a unified platform for managing users, allocating storage and pairing analysis tools with appropriate computational resources. Such a platform is best not developed by a single principal investigator, group or institution, but rather supported by an international community of users, developers and educators.

We have developed a two-stage platform (Fig. 1) housed on three public Galaxy instances⁵ in the United States (http://usegalaxy.org), the European Union (http://usegalaxy.eu) and Australia (http://usegalaxy.org.au) and capable of supporting hundreds of thousands of complex analyses per month. Anyone can run effectively unlimited computation with 250 GB (expandable) of disk space. Stage 1 consists of mature utilities for quality control, mapping, assembly and allelic variant (AV) calling, which run entirely in Galaxy and are distributed via the BioConda project⁶. Stage 2 consists of snippets of code for data transformation, exploration and visualization that run within standard web-browser-based notebook environments. Stage 1 produces variant lists, whereas stage 2 uses notebooks to perform descriptive analyses of datasets. In addition, an interactive dashboard tracks temporal AV dynamics. (See https://covid19.galaxyproject.org for data, workflows, notebooks, dashboard and our ongoing automated tracking of large-scale genomic surveillance projects.)

**Fig. 1: Analysis flow for calling SARS-CoV-2 variants using Galaxy.**

ONT, Oxford Nanopore Technologies; VCF, variant call format; TSV, tab-separated values; PE, paired end; SE, single end. For more information, see https://covid19.galaxyproject.org.

Four primary analysis workflows (Supplementary Table 1) identify SARS-CoV-2 AVs from deep-sequencing reads and produce annotated AV calls through a series of steps including quality control, trimming, mapping, deduplication, AV calling and filtering. Their output is processed by the Reporting and Consensus workflows (Supplementary Table 1) to generate standardized data tables describing AVs, along with consensus genome sequences. These tables are then summarized and visualized using interactive notebooks.
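As a rough illustration of the kind of notebook step that could follow the standardized AV tables, the sketch below counts how many samples carry each variant. The column names (`Sample`, `POS`, `REF`, `ALT`, `AF`) and the example rows are assumptions made for this sketch and are not taken from the platform's actual table schema.

```python
import csv
import io
from collections import Counter

# Hypothetical per-sample AV table. The column names and values are
# illustrative assumptions, not the platform's documented output format.
EXAMPLE_TSV = (
    "Sample\tPOS\tREF\tALT\tAF\n"
    "S1\t241\tC\tT\t0.99\n"
    "S1\t23403\tA\tG\t0.98\n"
    "S2\t241\tC\tT\t0.97\n"
    "S2\t11083\tG\tT\t0.12\n"
)


def count_samples_per_variant(tsv_text):
    """Map each (POS, REF, ALT) variant to the number of distinct samples carrying it."""
    seen = set()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # A set removes duplicate rows for the same sample and variant.
        seen.add((row["Sample"], int(row["POS"]), row["REF"], row["ALT"]))
    return Counter((pos, ref, alt) for _, pos, ref, alt in seen)


counts = count_samples_per_variant(EXAMPLE_TSV)
```

In a real notebook the table would be read from a workflow output file rather than an inline string; the inline data only keeps the sketch self-contained.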

To illustrate the platform’s utility and scalability, we refer the reader to two large SARS-CoV-2 Illumina datasets (PRJNA622837, 619 samples from early SARS-CoV-2 transmission in the Boston area⁷; and PRJEB37886, ~100,000 samples analyzed as of the time of writing from the COVID-19 Genomics UK (COG-UK) genomic surveillance effort⁸) detailed in Supplementary Tables 1–3 and Supplementary Figs. 1–3. Analysis on COVID-19 Galaxy Project resources provides insights into co-occurrence patterns, presence of mutations defining variants of concern (https://cov-lineages.github.io/lineages-website/global_report.html), and intersection with sites under selection, including non-random associations among common low-frequency AVs that may reflect shared intra-host dynamics (Supplementary Fig. 1 and Supplementary Table 2). It can also highlight the emergence of mutations interfering with binding of polyclonal antibodies⁹ (for example, COG-UK data in Supplementary Fig. 2), suggesting possible intra-host dynamics. These and other interactive notebooks and dashboards on the platform could identify AVs that warrant closer monitoring as the pandemic continues.
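The co-occurrence patterns mentioned above can be approximated, in the simplest case, by counting how often two AVs appear in the same sample. The following is a minimal sketch on made-up toy data; the variant labels and the function name are hypothetical, and raw pair counts are only a starting point, not a statistical test of non-random association.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(variants_by_sample):
    """Count, for each unordered pair of variants, the samples that carry both."""
    pair_counts = Counter()
    for variants in variants_by_sample.values():
        # Sorting gives each pair a single canonical order.
        for pair in combinations(sorted(set(variants)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Toy input: sample identifiers and variant labels are invented for illustration.
example = {
    "S1": ["C241T", "A23403G", "G11083T"],
    "S2": ["C241T", "A23403G"],
    "S3": ["G11083T"],
}
pairs = cooccurrence_counts(example)
```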

Our system is designed to encourage scalable collaborative worldwide genomic surveillance to identify and respond to emerging variants. By relying on raw read data rather than assembled genomes and allowing every result to be traced back to its raw data, it goes a step beyond current surveillance efforts. Specifically, it enables surveillance of intra-patient minor AV frequencies—a distinction that could yield early warnings of epidemiological conditions conducive to the emergence of variants with altered pathogenicity, vaccine sensitivity or drug resistance.
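To make the distinction between minor and consensus-level AVs concrete, the sketch below splits calls by allele frequency while keeping the sample identifier attached, so each call remains traceable to its source data. The thresholds (`min_af=0.05`, `consensus_af=0.5`) and the `AlleleCall` structure are assumptions chosen for illustration, not values or types used by the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCall:
    sample: str              # identifier linking the call back to its raw reads
    variant: str             # e.g. "G11083T"
    allele_frequency: float  # fraction of reads supporting the variant


def split_by_frequency(calls, min_af=0.05, consensus_af=0.5):
    """Split calls into (minor, major) lists; calls below min_af are dropped.

    Both thresholds are illustrative assumptions.
    """
    minor, major = [], []
    for call in calls:
        if call.allele_frequency >= consensus_af:
            major.append(call)
        elif call.allele_frequency >= min_af:
            minor.append(call)
    return minor, major


calls = [
    AlleleCall("sample_A", "C241T", 0.98),
    AlleleCall("sample_A", "G11083T", 0.12),
    AlleleCall("sample_B", "A23403G", 0.01),
]
minor, major = split_by_frequency(calls)
```

An analysis based only on assembled consensus genomes would see the `major` list alone; the `minor` list is what read-level data adds.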

References

1. Hodcroft, E. B. et al. Nature 591, 30–33 (2021).
2. Quick, J. et al. Nat. Protoc. 12, 1261–1276 (2017).
3. Grubaugh, N. D. et al. Genome Biol. 20, 8 (2019).
4. Baker, D. et al. PLoS Pathog. 16, e1008643 (2020).
5. Jalili, V. et al. Nucleic Acids Res. 48 (W1), W395–W402 (2020).
6. Grüning, B. et al. Nat. Methods 15, 475–476 (2018).
7. Lemieux, J. et al. Science https://doi.org/10.1126/science.abe3261 (2021).
8. du Plessis, L. et al. Science 371, 708–712 (2021).
9. Greaney, A. J. et al. Cell Host Microbe 29, 463–476.e6 (2021).

Acknowledgements

The authors are grateful to the broader Galaxy community for their support and software development efforts. This work is funded by NIH grants U41 HG006620 and NSF ABI grant 1661497. Usegalaxy.eu is supported by the German Federal Ministry of Education and Research grants 031L0101C and de.NBI-epi to B.G. Galaxy and HyPhy integration is supported by NIH grant R01 AI134384 to A.N. Usegalaxy.org.au is supported by Bioplatforms Australia and the Australian Research Data Commons through funding from the Australian Government National Collaborative Research Infrastructure Strategy. The hyphy.org development team is supported by NIH grant R01GM093939. Usegalaxy.be is supported by the Research Foundation-Flanders (FWO) grant I002919N and the Flemish Supercomputer Center (VSC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Affiliations

University of Freiburg, Freiburg, Germany
Wolfgang Maier, Simon Bray, Milad Miladi & Björn Grüning
The Pennsylvania State University, University Park, PA, USA
Marius van den Beek, Dave Bouvier, Nathan Coraor & Anton Nekrutenko
GalaxyWorks Inc, Baltimore, MD, USA
Babita Singh & Jordi Rambla De Argila
Centre for Genomic Regulation, Viral Beacon Project, Barcelona, Spain
Dannon Baker
Johns Hopkins University, Baltimore, MD, USA
Nathan Roach
University of Melbourne, Melbourne, Victoria, Australia
Simon Gladman & Andrew Lonie
Ghent University, Ghent, Belgium
Frederik Coppens
VIB Center for Plant Systems Biology, Ghent, Belgium
Frederik Coppens
University of Cape Town, Cape Town, South Africa
Darren P. Martin
Temple University, Philadelphia, PA, USA
Sergei L. Kosakovsky Pond

Corresponding authors

Correspondence to Björn Grüning or Sergei L. Kosakovsky Pond or Anton Nekrutenko.

Additional information

Peer review information Nature Biotechnology thanks Jason Sahl for their contribution to the peer review of this work.
