Bioinformatic requirements for protein database searching using predicted epitopes from disease-associated antibodies.

Title: Bioinformatic requirements for protein database searching using predicted epitopes from disease-associated antibodies.
Publication Type: Journal Article
Year of Publication: 2008
Authors: Bastas, G, Sompuram, SR, Pierce, BG, Vani, K, Bogen, SA
Journal: Mol Cell Proteomics
Volume: 7
Issue: 2
Pagination: 247-56
Date Published: 2008 Feb
ISSN: 1535-9484
Keywords: Amino Acid Motifs; Amino Acid Sequence; Antibodies; Antibodies, Monoclonal; Antigens; Bacteriophages; Computational Biology; Consensus Sequence; Databases, Protein; Disease; Epitopes; Humans; Immunoblotting; Molecular Sequence Data; Peptides; Reproducibility of Results
Abstract

We describe a new approach to identify proteins involved in disease pathogenesis. The technology, Epitope-Mediated Antigen Prediction (E-MAP), leverages the specificity of patients' immune responses to disease-relevant targets and requires no prior knowledge about the protein. E-MAP links pathologic antibodies of unknown specificity, isolated from patient sera, to their cognate antigens in the protein database. The E-MAP process first involves reconstruction of a predicted epitope using a peptide combinatorial library. We then search the protein database for closely matching amino acid sequences. Previously published attempts to identify unknown antibody targets in this manner have largely been unsuccessful for two reasons: 1) short predicted epitopes yield too many irrelevant matches from a database search and 2) the epitopes may not represent the native antigen with sufficient fidelity. Using an in silico model, we demonstrate the critical threshold requirements for epitope length and epitope fidelity. We find that epitopes generally need to have at least seven amino acids, with an overall accuracy of >70% to the native protein, in order to correctly identify the protein in a nonredundant protein database search. We then confirm these findings experimentally, using the predicted epitopes for four monoclonal antibodies. Because many predicted epitopes fail to reach the seven-amino-acid threshold, we demonstrate the efficacy of paired epitope searches. This is the first systematic analysis of the computational framework needed to make this approach viable, coupled with experimental validation.
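
The database search step described in the abstract can be illustrated with a minimal sketch. This is not the authors' E-MAP implementation: it assumes an ungapped sliding-window comparison, treats "accuracy" as the fraction of identical positions, and uses a toy in-memory database. The names `identity` and `search_epitope`, the `min_identity` parameter, and the example sequences are all hypothetical.

```python
from typing import Dict, List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length peptides agree."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def search_epitope(
    epitope: str,
    database: Dict[str, str],
    min_identity: float = 0.7,
) -> List[Tuple[str, int, float]]:
    """Slide the epitope along every protein and keep windows whose
    identity exceeds min_identity (assumed reading of the >70% threshold).

    Returns (protein name, 0-based offset, identity) tuples, best first.
    """
    k = len(epitope)
    hits = []
    for name, seq in database.items():
        # Proteins shorter than the epitope produce no windows.
        for i in range(len(seq) - k + 1):
            score = identity(epitope, seq[i:i + k])
            if score > min_identity:
                hits.append((name, i, score))
    # Highest identity first; ties broken by name and offset.
    hits.sort(key=lambda h: (-h[2], h[0], h[1]))
    return hits


if __name__ == "__main__":
    # Hypothetical two-protein database, for illustration only.
    toy_db = {
        "protA": "MKTAYIAKQRQISFVKSHFSRQ",
        "protB": "GGGGGGGGGGGG",
    }
    for hit in search_epitope("AYIGKQR", toy_db):
        print(hit)
```

For a seven-residue epitope, an identity above 0.7 means at least five matching positions. A real search would run against a nonredundant protein database and would likely need a substitution-aware scoring scheme, which this sketch omits.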

DOI: 10.1074/mcp.M700107-MCP200
Alternate Journal: Mol. Cell Proteomics
PubMed ID: 17897933
Grant List: CA106847 / CA / NCI NIH HHS / United States
CA94557 / CA / NCI NIH HHS / United States