Last updated on 2024-02-11T22:41:43+00:00 by LN Anderson
PNNL DataHub NIAID Program Project: Modeling Host Responses to Understand Severe Human Virus Infections, Multi-Omic Viral Dataset Catalog Collection
Background
The National Institute of Allergy and Infectious Diseases (NIAID) "Modeling Host Responses to Understand Severe Human Virus Infections" project (U19AI106772) was a highly integrated systems biology research consortium, funded by the NIAID Systems Biology Program from 2013 to 2018, that investigated the complex host response to infection by NIAID priority pathogen viruses. Project deliverables include an extensive systems biology data collection that serves both to enhance predictive modeling of infectious disease and to identify functional regulators of severe human virus pathogenicity, supporting the development of new therapeutic interventions that assist the human host response to Category A, B, and C priority pathogens.
Impact
Omics lethal human virus (OMICS-LHV) sub-project research activities, conducted at Pacific Northwest National Laboratory (PNNL), included the Proteomics, Metabolomics, and Lipidomics Core (PML) and the Computational Modeling Core (CMC). These cores performed proteomic, metabolomic, lipidomic, and transcriptomic data profiling, leveraging world-class mass spectrometry (MS)-based capabilities and state-of-the-art computational techniques to provide a comprehensive multi-omics collection of viral experimental infection data.
Herein, PNNL sub-projects provide a never-before-released infectious disease dataset collection containing both primary and secondary multi-Omics datasets, profiling a series of priority pathogen primary experimental studies for enhanced open access to viral Omics lifecycle datasets and project metadata. Produced through a highly integrated and multidisciplinary approach, the linked primary data and metadata that support secondary data analysis provide the information necessary for research reproducibility and long-term preservation. Enabling on-demand data access for research community consumption and developer reuse supports new insights and discoveries into host-pathogen interactions, aiding future biohazard data preparedness efforts and emergency response to global health crises involving viral infections.
Project Collection Reference Citations
- Anderson, L.N., Eisfeld, A.J., Waters, K.M. PNNL DataHub NIAID Program Project: Modeling Host Responses to Understand Severe Human Virus Infections, Multi-Omic Viral Dataset Catalog Collection. PNNL DataHub (Web). DOI: 10.25584/PRJ.U19AI106772/1971764 (2023).
- Eisfeld, A.J., Anderson, L.N., Fan, S. et al. A compendium of multi-omics data illuminating host responses to lethal human virus infections. Sci Data 11, 328 (2024). https://doi.org/10.1038/s41597-024-03124-3
Reusable Viral Digital Data Project Downloads
Qualitative secondary data downloads for each viral experimental dataset DOI contain one or more statistically processed files of differential expression analysis data generated using high-resolution instrument capabilities. In addition, all primary metadata files (including experimental designs, dataset summaries, and viral titer workbooks) needed to support, corroborate, and verify the viral infection studies reported herein are included with each experimental dataset DOI download for enhanced transparency and reproducibility. Omics-LHV dataset DOI downloads (listed below) contain comprehensive time-sampled measurement data from proteomic (P), metabolomic (M), lipidomic (L), and/or transcriptomic (T) experimental studies, each of which has a direct relationship to a primary sample data accession in the Gene Expression Omnibus (GEO) and/or MassIVE domain repository. The collection references below cover the multi-Omic host response to Ebola virus (EBOV) infection, Influenza A virus (IAV) infection, West Nile virus (WNV) infection, Middle Eastern Respiratory Syndrome coronavirus (MERS-CoV) infection, and human interferon (IFN) treatment.
Virus/Treatment-Specific Project Collection Reference Citations
- Anderson, Lindsey N, Eisfeld, Amie J, Waters, Katrina M, and Modeling Host Responses to Understand Severe Human Virus Infections Program Project. PNNL DataHub Omics-LHV Project Profiling of the Host Response to Ebola Virus Infection, a Processed Dataset DOI Catalog Experimental Collection. United States. 2021. PNNL DataHub (Web). DOI: 10.25584/LHVEBOV/1784282.
- Anderson, Lindsey N, Eisfeld, Amie J, Waters, Katrina M, and Modeling Host Responses to Understand Severe Human Virus Infections Program Project. PNNL DataHub Omics-LHV Project Profiling of the Host Response to Influenza A Virus Infection, a Processed Dataset DOI Catalog Experimental Collection. United States. 2021. PNNL DataHub (Web). DOI: 10.25584/LHVFLU/1773428.
- Anderson, Lindsey N, Eisfeld, Amie J, Waters, Katrina M, and Modeling Host Responses to Understand Severe Human Virus Infections Program Project. PNNL DataHub Omics-LHV Project Profiling of the Host Response to West Nile Virus Infection, a Processed Dataset DOI Catalog Experimental Collection. United States. 2021. PNNL DataHub (Web). DOI: 10.25584/LHVWNV/1784305.
- Anderson, Lindsey N, Eisfeld, Amie J, Waters, Katrina M, and Modeling Host Responses to Understand Severe Human Virus Infections Program Project. PNNL DataHub Omics-LHV Project Profiling of the Host Interferon-Stimulated Response to Virus Infection, a Processed Dataset DOI Catalog Experimental Collection. United States. 2021. PNNL DataHub (Web). DOI: 10.25584/LHVIFN/1786979.
- Anderson, Lindsey N, Eisfeld, Amie J, Waters, Katrina M, and Modeling Host Responses to Understand Severe Human Virus Infections Program Project. PNNL DataHub Omics-LHV Project Profiling of the Host Response to Middle Eastern Respiratory Syndrome coronavirus Infection, a Processed Dataset DOI Catalog Experimental Collection. United States. 2021. PNNL DataHub (Web). DOI: 10.25584/LHVMERS/1813911.
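For readers who want to reach these collections programmatically, the following is a minimal sketch that turns the five collection DOIs cited above into resolvable links using the standard https://doi.org/ resolver. The dictionary labels and the function names (`doi_url`, `collection_urls`) are illustrative assumptions and are not part of any PNNL DataHub API; the sketch only builds URLs and does not download anything.

```python
# Collection DOIs copied verbatim from the citations above.
# The short labels (keys) are illustrative, chosen for this sketch.
COLLECTION_DOIS = {
    "EBOV": "10.25584/LHVEBOV/1784282",
    "IAV": "10.25584/LHVFLU/1773428",
    "WNV": "10.25584/LHVWNV/1784305",
    "IFN": "10.25584/LHVIFN/1786979",
    "MERS-CoV": "10.25584/LHVMERS/1813911",
}

# Standard DOI resolver prefix.
DOI_RESOLVER = "https://doi.org/"


def doi_url(doi: str) -> str:
    """Return the resolver URL for a bare DOI string."""
    return DOI_RESOLVER + doi.strip()


def collection_urls() -> dict:
    """Map each collection label to its resolvable DOI link."""
    return {label: doi_url(doi) for label, doi in COLLECTION_DOIS.items()}


if __name__ == "__main__":
    for label, url in collection_urls().items():
        print(f"{label}: {url}")
```

Each resulting link should resolve to the landing page of the corresponding catalog collection, from which the individual experimental dataset DOI downloads can be reached.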
Source Code Reference Citations
- Kelly G. Stratton & Lisa M. Bramer. (2018). pmartR: Quality Control and Statistics for Mass Spectrometry-Based Biological Data (0.10.0). Zenodo. DOI: 10.5281/zenodo.6108668. URL: https://github.com/pmartR/pmartR/tree/v1.0.0
- Matthew Monroe, Cameron Casey, Grant Fujimoto, Christopher Wilkins, Joon-Yong Lee, Cameron Giberson, & Michael Degan. (2022). LIQUID: an-open source software for identifying lipids in LC-MS/MS-based lipidomics data. DOI: 10.5281/zenodo.6459463. URL: https://github.com/PNNL-Comp-Mass-Spec/LIQUID.
Linked Open Primary Datasets
Raw Microarray and Sequence Measurement Data (Transcriptomics) - 2,555 total datasets
Primary transcriptome experimental data collections and associated metadata are openly available from the NCBI Gene Expression Omnibus (GEO) data repository under umbrella BioProject PRJNA274402 and GEO Series GSE65575. The GEO database is a public domain community data repository, supported by the NIH, that promotes the free exchange of MIAME-compliant gene expression profile and array-based data for reuse and discovery.
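As a convenience, the sketch below builds browser links for the two accessions named above. It assumes the conventional public NCBI URL patterns (the GEO `acc.cgi` accession viewer and the BioProject record path); the function names are hypothetical and chosen for this example, and no data is fetched.

```python
import re
from urllib.parse import urlencode

# Assumed conventional NCBI landing-page patterns (not taken from this page).
GEO_ACC_ENDPOINT = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
BIOPROJECT_BASE = "https://www.ncbi.nlm.nih.gov/bioproject/"


def geo_series_url(accession: str) -> str:
    """Build the GEO accession-viewer link for a GEO Series (GSE) accession."""
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO Series accession: {accession!r}")
    return f"{GEO_ACC_ENDPOINT}?{urlencode({'acc': accession})}"


def bioproject_url(accession: str) -> str:
    """Build the BioProject record link for a PRJNA accession."""
    if not re.fullmatch(r"PRJNA\d+", accession):
        raise ValueError(f"not a BioProject accession: {accession!r}")
    return BIOPROJECT_BASE + accession


if __name__ == "__main__":
    # Accessions as cited in the paragraph above.
    print(geo_series_url("GSE65575"))
    print(bioproject_url("PRJNA274402"))
```

The accession checks are deliberately strict so that a mistyped identifier fails early rather than producing a link to an unrelated record.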
Raw Mass Spectrometry Measurement Data (Proteomics, Metabolomics, Lipidomics) - 21,194 total datasets
Primary mass spectrometry proteome, metabolome, and lipidome experimental data and corresponding parameter files, including those used for accurate mass and time (AMT) tag database generation, are openly available for download at the Mass Spectrometry Interactive Virtual Environment (MassIVE) data repository. MassIVE is a public domain community data repository promoting the free exchange of mass spectrometry data for reuse and discovery.
Reusable FAIRsharing Project Standards
Primary Data (raw measurement data)
Mass Spectrometry
- MassIVE: 10.25504/FAIRsharing.LYsiMd
- MIAPE-MS Reporting Guidelines: 10.25504/FAIRsharing.5g1fma
Microarray
- GEO Repository: 10.25504/FAIRsharing.5hc8vt
- MIAME Reporting Guidelines: 10.25504/FAIRsharing.32b10v
Secondary Data (processed measurement data)
Multi-Omics Datasets
- PNNL DataHub: 10.25504/FAIRsharing.45bf5b
Source Code (software)
Data Analysis Tools
- Zenodo: 10.25504/FAIRsharing.wy4egf
- GitHub: 10.25504/FAIRsharing.c55d5e
Acknowledgment of Federal Funding
The data described here were funded in whole or in part by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under award number U19AI106772 and are a contribution of the "Modeling Host Responses to Understand Severe Human Virus Infections" Project at Pacific Northwest National Laboratory. Proteomics, metabolomics, and lipidomics analyses by the Omics-LHV Core were performed at Pacific Northwest National Laboratory in the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the Department of Energy’s (DOE) Office and operated by the Battelle Memorial Institute for the DOE under contract number DE-AC05-76RLO1830.
Citation Policy
To enable discovery, reproducibility, and reuse of NIH-funded project datasets, we ask that all reuse of project data and metadata download materials acknowledge all primary and secondary dataset citations where applicable, and cite the directly corresponding journal articles (Grant U19AI106772) where allowable, in accordance with the best practices outlined by the FORCE11 Joint Declaration of Data Citation Principles and in alignment with NIH acknowledgement requirements.
Data Licensing
CC BY 4.0 (dataset DOI downloads), CC0 1.0 (PNNL DataHub policy default)