| Abstract: |
Mass spectrometry has become the most widely used tool for the characterization of proteins within complex mixtures. In this talk, I will describe several successful applications of machine learning to improve the rate at which we can correctly assign peptide sequences to observed tandem mass spectra. We use supervised and semi-supervised discriminative learning methods to train a classifier that distinguishes correctly annotated spectra from incorrectly annotated ones. Unlike previous methods, the classifier can be trained dynamically on each given data set, thereby adjusting to particular characteristics of the sample preparation protocol, machine platform, calibration, and chromatography conditions. We have also trained a dynamic Bayesian network to model the process of peptide fragmentation within the mass spectrometer. The resulting model yields useful insights into fragmentation biochemistry as well as significantly improved peptide identification performance. |