Session 57 Poster Abstracts
Factors Impacting Disease Progression
Session Day and Time: Monday, 1-4 pm
Room: Hall D


306    
Viroverse: A Research Database and Bioinformatics Analysis Framework
Brandon Maust*1, W Deng1, J Stoddard1, Z Frazier2, M Guerquin1, G Learn1, R Samudrala1, R Bumgarner1, and J Mullins1
1Univ of Washington, Seattle, US and 2Univ of Southern California, Los Angeles, US

Background:  Groups around the world have collected pathogen and human gene sequence data, along with relevant clinical and laboratory parameters, from large cohorts. While these data serve their original purpose of supporting basic research, researchers who wish to mine them face major obstacles. Obtaining data often requires repository-specific request procedures, including submission of a specific hypothesis, and heterogeneously collected data must be converted to a useful common format and encoding. Both steps are frequent encumbrances to beginning a study.
Methods:  Using the Seattle Primary Infection Project and Multi-Center AIDS Cohort Study (MACS) cohorts as prototypes, we developed a database and toolkit, which together constitute a software infrastructure for the acquisition, retention, and evaluation of clinical, laboratory, and genetic data derived from human hosts and their infecting pathogens. Focusing on HIV, we built a highly normalized relational database structure specific to the molecular biology and attendant data of viral pathogens and a series of tools to capture experimental data and couple it to analysis.
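As an illustration of what a normalized relational structure for this kind of data might look like, the following is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and demo values are hypothetical assumptions made for this example; they are not the actual Viroverse schema, which is not described in detail here.

```python
import sqlite3

# Hypothetical, simplified schema. All table and column names are
# illustrative assumptions, not the actual Viroverse design.
SCHEMA = """
CREATE TABLE subject (
    subject_id INTEGER PRIMARY KEY,
    cohort     TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id    INTEGER PRIMARY KEY,
    subject_id   INTEGER NOT NULL REFERENCES subject(subject_id),
    collected_on TEXT NOT NULL
);
CREATE TABLE viral_sequence (
    sequence_id INTEGER PRIMARY KEY,
    sample_id   INTEGER NOT NULL REFERENCES sample(sample_id),
    gene        TEXT NOT NULL,
    residues    TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up record chain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce REFERENCES clauses
    conn.executescript(SCHEMA)
    # Made-up demo rows: subject -> sample -> sequence.
    conn.execute("INSERT INTO subject VALUES (1, 'MACS')")
    conn.execute("INSERT INTO sample VALUES (10, 1, '2001-05-14')")
    conn.execute(
        "INSERT INTO viral_sequence VALUES (100, 10, 'env', 'ATGAGAGTGAAG')"
    )
    conn.commit()
    return conn


def sequences_for_cohort(conn, cohort, gene):
    """Return (subject_id, collected_on, residues) for one cohort and gene."""
    rows = conn.execute(
        """
        SELECT su.subject_id, sa.collected_on, vs.residues
        FROM viral_sequence AS vs
        JOIN sample  AS sa ON sa.sample_id  = vs.sample_id
        JOIN subject AS su ON su.subject_id = sa.subject_id
        WHERE su.cohort = ? AND vs.gene = ?
        ORDER BY su.subject_id, sa.collected_on
        """,
        (cohort, gene),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in sequences_for_cohort(db, "MACS", "env"):
        print(row)
```

Because each fact is stored once and linked by keys, a sequence can be traced back to its sample and subject with joins, and the foreign-key constraints reject a sample or sequence that points to a nonexistent parent record.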
Results:  This database, Viroverse, currently includes >1800 subject records, the majority of which have complete information, including medical history, demographics, laboratory tests, risk assessment and sexual behavior data, host genetic markers, viral gene sequences, and cytotoxic T lymphocyte (CTL) recognition data. Existing data were assembled from a variety of heterogeneous formats using a generalized data loading interface. Additional entry forms capture experimental data from sample acquisition onward. EpitopeDB is a specialized interface to collect and query subject data from enzyme-linked immunosorbent spot assay (ELISpot) reactivity experiments. Diver is an automated interface to standard phylogenetic analyses of gene sequences.
Conclusions:  Viroverse is proving to be a useful tool for handling large amounts of data generated within a lab and through collaborations with other groups. A consistent analytical framework and a means for interchanging information on viral infections facilitate both efficient manual exploration and extraction of data and the development of new tools for discovery-based data mining approaches. The former will facilitate hypothesis testing, while the latter will allow rapid exploration of statistically significant correlations that, in turn, will generate novel hypotheses and increase the pace at which scientific exploration can proceed.