Slavov lab | Quantitative Biology
Introduction to the principles of
Mass spectrometry analysis


Lectures

An introduction to the history of mass spectrometry and the basic principles by which it identifies the chemical composition of analytes.






An introduction to the basic principles for quantitative mass spectrometry analysis of proteins.






An introduction to data analysis with a focus on mass spectrometry proteomics. Topics include data normalization, quality control, bias correction, clustering, and gene set enrichment analysis.






Data integration and analysis. Standards for benchmarking quantification.






An introduction to incorporating data reliability into analysis with a focus on errors-in-variables modeling and data analysis.






Data analysis problems

The two problems for the mass-spectrometry module are listed below. Submit your solutions to GitHub, and submit the links to the GitHub repositories via Blackboard or Slack. You may find this tutorial on how to use GitHub useful. I recommend using GitHub as an introduction to version control and as a learning experience that will help you collaborate with others and share your code for published papers. The problems below intentionally do not specify all details and parameters, to leave space for your creativity. Use the freedom.
  1. Gene set enrichment analysis: Download the consensus dataset of protein levels across human tissues from "Post-transcriptional regulation across human tissues". Perform gene set enrichment analysis based on protein abundance variation across the tissues and display the results for the 50 gene ontology (GO) terms that differ most across the 13 tissues. You can display the results using a heatmap similar to Fig 3.

  2. Total least squares analysis: A practice problem demonstrating biases in the results from partial least squares and how they can be corrected by using total least squares.
    • Simulate data for 1000 problems
      1. Generate vector x by sampling N data points from a uniform distribution, and compute vector y as y = 2*x
      2. Add Gaussian noise with mean zero and variance σ², N(0, σ²), to both x and y to simulate the measurement noise.
    • Solve the problems and visualize the results, e.g., inferred slope distributions, for the following parameters:
      1. N = 50, 100, 500, 1000
      2. σ² = 0.1, 0.25, 0.5, 0.75, 1
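
As a starting point, the simulation steps above could be sketched as follows. This is a minimal sketch and not a reference solution. The sampling range [0, 10) for x, the random seed, and the function names (simulate, ols_slope, tls_slope, run) are all assumptions that the problem leaves open. Ordinary least squares serves as the baseline here, which is an assumption about the intended comparison. The total least squares slope is taken from the smallest right singular vector of the centered data, which is appropriate when x and y carry noise of equal variance, as in this problem.

```python
import numpy as np


def simulate(n, sigma2, rng, slope=2.0, x_max=10.0):
    """Simulate one problem: y = slope * x with noise on both x and y.

    x_max is an assumed upper bound for the uniform distribution;
    the problem statement does not fix the range.
    """
    x_true = rng.uniform(0.0, x_max, size=n)
    y_true = slope * x_true
    sd = np.sqrt(sigma2)  # rng.normal takes a standard deviation
    x = x_true + rng.normal(0.0, sd, size=n)
    y = y_true + rng.normal(0.0, sd, size=n)
    return x, y


def ols_slope(x, y):
    """Ordinary least squares slope of y on x (treats x as noise-free)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(xc @ yc / (xc @ xc))


def tls_slope(x, y):
    """Total least squares slope from the SVD of the centered data."""
    data = np.column_stack([x - x.mean(), y - y.mean()])
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    a, b = vt[-1]  # normal vector of the best-fit line a*x + b*y = 0
    return float(-a / b)


def run(n, sigma2, n_problems=1000, seed=0):
    """Return arrays of OLS and TLS slopes over n_problems simulations."""
    rng = np.random.default_rng(seed)
    ols = np.empty(n_problems)
    tls = np.empty(n_problems)
    for i in range(n_problems):
        x, y = simulate(n, sigma2, rng)
        ols[i] = ols_slope(x, y)
        tls[i] = tls_slope(x, y)
    return ols, tls


if __name__ == "__main__":
    for n in (50, 100, 500, 1000):
        for sigma2 in (0.1, 0.25, 0.5, 0.75, 1.0):
            ols, tls = run(n, sigma2)
            print(f"N={n:5d} sigma2={sigma2:4.2f} "
                  f"OLS mean={ols.mean():.3f} TLS mean={tls.mean():.3f}")
```

The arrays returned by run can be passed to any plotting routine, for example as overlaid histograms, to compare the inferred slope distributions against the true slope of 2.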