Session 59 Poster Presentations
Viral Genetic Diversity
Session Day and Time: Thursday 1:30 - 3:30 pm
Room: Hall A


499
Compartment-specific Patterns Associated with CSF-derived V3 Sequences
Benjamin Good1, Satish Pillai*2, Jacques Corbeil2, Joseph Wong2
1Veterans Med Res Fndn, San Diego, CA and 2Univ of California at San Diego

Background: It has been hypothesized that the selective environment within the CNS compartment differs from that of peripheral tissues. Virus replicating within the CNS is therefore expected to develop genotypic and phenotypic characteristics distinct from those of peripheral virus. V3 genotype is likely to differ, because the predominant target cell type differs between compartments. We applied a battery of machine learning techniques to discriminate between CSF- and plasma-derived V3 amino acid sequences.

Methods: All 188 CSF V3 sequences were downloaded from the Los Alamos HIV Sequence Database. A matching set of 188 plasma-derived V3 sequences was randomly chosen from the same database. The entire sequence set was aligned using ClustalW. A subset of the algorithms in the Weka (Waikato Environment for Knowledge Analysis) machine learning suite was used to classify V3 amino acid sequences by compartment of origin. Ten-fold cross-validation was used to assess the accuracy of all tested classifiers.
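
As an illustration only, the cross-validation step described in the Methods can be sketched in a few lines of Python. This is not the Weka workflow used in the study: the classifier here is a 1-nearest-neighbour rule on Hamming distance, swapped in as a stand-in for the support vector machine and decision tree inducer, and the sequences in `toy` are synthetic placeholders rather than database entries. The function names (`k_fold_indices`, `hamming`, `predict_1nn`, `cross_validate`) are hypothetical, chosen for this sketch.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def hamming(a, b):
    """Count mismatched columns between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def predict_1nn(train, query):
    """Label of the closest training sequence (stand-in classifier, not SVM)."""
    return min(train, key=lambda item: hamming(item[0], query))[1]

def cross_validate(data, k=10, seed=0):
    """Estimate accuracy by k-fold cross-validation over (sequence, label) pairs."""
    correct = 0
    for fold in k_fold_indices(len(data), k, seed):
        held_out = set(fold)
        # Train on every sequence outside the current fold.
        train = [data[i] for i in range(len(data)) if i not in held_out]
        # Score the held-out sequences against that training set.
        correct += sum(predict_1nn(train, data[i][0]) == data[i][1] for i in fold)
    return correct / len(data)

# Synthetic placeholder sequences (assumption: NOT real V3 data).
toy = [("GPGRAF" + x, "CSF") for x in "ACDEF"] + \
      [("GPGKSW" + x, "plasma") for x in "ACDEF"]

accuracy = cross_validate(toy, k=5)
print(f"cross-validated accuracy on toy data: {accuracy:.3f}")
```

Each sequence is held out exactly once, so the reported accuracy is the fraction of held-out sequences assigned to the correct compartment. With the real 376-sequence set, `k=10` would match the ten-fold design described above.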

Results: The support vector machine constituted the best overall classifier, categorizing V3 sequences based on their compartment of origin with 90.6% accuracy. A decision tree inducer achieved an accuracy of 89.6% in 10-fold cross-validation and revealed a complex sequence signature associated with compartmentalization.

Conclusions: The performance of these classifiers suggests that there is a strong compartment-specific sequence signature. Unlike signatures associated with other phenotypes, e.g., primary drug resistance and coreceptor usage, the underlying pattern was complex and comprised many positions across the V3 sequence. The discovery of compartment-specific genotypic characteristics may prove invaluable in tracking evolutionary patterns during HIV-1 infection. In addition, the deciphering of these patterns by the support vector machine and decision tree inducer speaks to the efficacy of a machine learning approach to sequence-based classification.