Protein Sequencing

In a nutshell, proteins are digested into peptides, which are separated by liquid chromatography and ionized for introduction into a mass spectrometer, where they are fragmented in a process called tandem mass spectrometry. The resulting fragmentation pattern is compared to theoretical fragmentation patterns of peptides of similar mass derived from a protein sequence library and scored based on the quality of the match. A single match can be sufficient to confidently identify a protein and categorize it as present in the sample. Multiple matches to multiple peptide sequences from the same protein sequence can, in aggregate, be used to validate that a protein with the predicted protein sequence is present in the sample.

The “shotgun” in shotgun proteomics refers to the process of cleaving one or more proteins into pieces. These pieces are called peptides. Cleavage is most commonly performed with the protease trypsin.

High-performance liquid chromatography (HPLC) is a method used to separate analytes according to one or more of their properties. In shotgun proteomics, tryptic peptides are most often separated by hydrophobicity on a reversed-phase column. This is done both to slow the entry of peptides into the mass spectrometer and to maximize the concentration of each peptide as it enters the mass spectrometer by separating the peptides from each other. A mass spectrometer can only sample a finite number of peptides per unit time (about 10 per second on a Thermo Q-Exactive Orbitrap). Separation of the peptides is accomplished by running an increasing concentration of acetonitrile through the column. As the concentration of acetonitrile increases, increasingly hydrophobic peptides elute from the column for introduction into the mass spectrometer.

Nanoflow liquid chromatography (nLC) can improve overall sensitivity by decreasing the volume in which a peptide elutes, thereby increasing the concentration of the peptide per unit time as it enters the mass spectrometer. Running nLC involves using small-diameter columns (typically 50 or 75 micron inner diameter) and nanoliter per minute flow rates (typically 100 - 300 nanoliters per minute), and requires an HPLC capable of pumping a gradient at those flow rates (e.g., Thermo Easy-nLC II or 1000).

Longer chromatographic columns and smaller-diameter chromatographic media coupled with nLC can further improve chromatographic resolution, which increases peptide concentration per unit time, resulting in increased sensitivity. However, longer columns and smaller-diameter media require increased pressure to pump the separation gradient through the column. Ultra-high pressure chromatography (UPLC; e.g., Thermo Easy-nLC 1000) can deliver the pressure necessary to perform separations on 50 cm columns with 2-micron media. This enables highly resolving separations of 4 hours or more, which, for many applications, removes the need to pre-fractionate the peptides prior to UPLC.

Peptides must be transferred from the liquid phase to the gas phase for introduction into a mass spectrometer. There are numerous ionization methods, but the two most commonly used rely on either a laser (matrix-assisted laser desorption/ionization, MALDI) or a high-voltage electric field (electrospray ionization, ESI) for the transfer. Work on these two approaches was recognized with a share of the 2002 Nobel Prize in Chemistry.

Tandem mass spectrometry is the process by which a peptide is isolated, fragmented and detected inside a mass spectrometer. Once a mass spectrometer detects the presence of a peptide (typically as a high-intensity, multiply charged ion), a decision is made to isolate the ion within a defined mass-to-charge (m/z) window. Isolation can be done “in space” (on beam-type devices such as quadrupole time-of-flight instruments) or “in time” (on trap devices). Once isolated, the peptide is fragmented. There are numerous fragmentation methods. Two of the more common are collision-induced dissociation (CID, also called collision-activated dissociation or CAD) and higher-energy collisional dissociation (HCD). Both methods fragment a peptide primarily along the peptide backbone, generating a population of fragment ions with one or more amino acids cleaved from either end. Fragment ions with amino acids removed from the C-terminus and N-terminus are denoted b-ions and y-ions, respectively. CID typically yields both b- and y-ions, potentially allowing the peptide to be sequenced from both ends. HCD typically yields primarily y-ions, as well as immonium ions in the low-mass region.
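To make the b- and y-ion series concrete, here is a minimal Python sketch that computes singly charged fragment m/z values for an unmodified peptide. The function name fragment_ions and the example peptide are illustrative assumptions, not part of any search engine; the residue masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass of the charge carrier


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b_i keeps the first i residues (the N-terminal side).
        b_ions.append(sum(masses[:i]) + PROTON)
        # y_i keeps the last i residues (the C-terminal side) plus water.
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("PEPTIDE")
    for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {b_mz:.4f}    y{i} = {y_mz:.4f}")
```

Each b-ion and its complementary y-ion add up to the intact peptide mass plus two protons, which is one reason a spectrum containing both series can be read from either end.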

Shotgun proteomic search algorithms take as input a peak list file containing data from a single tandem mass spectrum, including the precursor (intact) mass of the peptide isolated for fragmentation, the charge state of the peptide, and the masses and intensities of each of the fragment ions. The algorithm takes the protein sequence library defined in the search (typically the entire sequence library for a species such as Homo sapiens) and performs an “in silico” digestion to generate all possible peptides with tryptic ends. From this set of peptides, a subset is selected whose precursor masses fall within the mass tolerance range defined in the search algorithm parameters (typically +/- 1.5 Da on an ion trap and +/- 20 ppm on an Orbitrap). The algorithm then generates theoretical fragmentation patterns for each peptide in this subset, compares the experimentally obtained fragmentation spectrum to each of these, and assigns a score to each. Various methods are used to estimate the likelihood that the highest-scoring match is a random event. Some algorithms, such as OMSSA and X!Tandem, calculate expectation values. For these, an expectation value score of 0.01 indicates that there is a 1 in 100 chance that the match is a random event. In other words, in a population of 100 matches with expectation value scores of 0.01, we would expect one of the matches to be a random event, that is, a false positive identification, but we would not know which one.
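The first two steps of the search, in silico digestion and precursor mass filtering, can be sketched in a few lines of Python. This is a simplified illustration rather than how OMSSA or X!Tandem are implemented: it assumes fully tryptic peptides with no missed cleavages and no modifications, applies the common rule that trypsin cleaves after K or R unless the next residue is P, and supports only a ppm tolerance. The function names, the min_length parameter and the toy protein sequence are all hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein, min_length=6):
    """Cleave after K or R unless followed by P; no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return [p for p in peptides if len(p) >= min_length]


def peptide_mass(peptide):
    """Monoisotopic mass of the neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def candidates(protein, precursor_mass, tolerance_ppm=20.0):
    """Peptides whose theoretical mass lies within tolerance_ppm of the precursor."""
    hits = []
    for peptide in tryptic_digest(protein):
        mass = peptide_mass(peptide)
        if abs(mass - precursor_mass) / mass * 1e6 <= tolerance_ppm:
            hits.append(peptide)
    return hits


if __name__ == "__main__":
    toy_protein = "PEPTIDEKAAAAAARGGGGGG"  # made-up sequence
    print(tryptic_digest(toy_protein))
    print(candidates(toy_protein, 927.4549))
```

Only the peptides that survive this filter go on to have theoretical fragmentation patterns generated and scored, which is why a tighter tolerance means less work for the algorithm.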

A shotgun proteomics search algorithm uses two mass tolerance filters: one for the precursor ion mass measurement (intact peptide mass) and another for the fragment ion mass measurements. For measurements with higher mass accuracy these tolerances may be tightened, so that there are fewer candidates for a match. This results in a higher score for a given tandem mass spectrum, decreasing the likelihood of a false positive assignment. Tighter mass tolerances also mean that fewer theoretical mass spectra need to be evaluated, which shortens the search time. An ion trap mass spectrometer typically has nominal mass accuracy; that is, a mass assignment is accurate to within about +/- 1 Da. An Orbitrap mass spectrometer has a mass accuracy of better than +/- 10 ppm. In practice, this difference determines whether it is possible to distinguish glutamine from lysine (a difference of about 0.036 Da), a peptide from its deamidated counterpart (about 0.984 Da), and a tri-methylation modification from an acetylation modification (about 0.036 Da).
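A rough worked example shows why the two tolerance regimes behave so differently. The Python sketch below converts a ppm tolerance to daltons for an assumed 1,500 Da peptide (an illustrative choice) and asks whether each mass difference exceeds the tolerance. Treating "difference larger than tolerance" as "distinguishable" is a simplification, and the function names are hypothetical. The mass shifts are standard monoisotopic values.

```python
# Monoisotopic mass differences in daltons (standard values, rounded).
MASS_SHIFTS = {
    "glutamine vs lysine": 128.09496 - 128.05858,
    "deamidation": 0.98402,
    "tri-methylation vs acetylation": 42.04695 - 42.01057,
}


def tolerance_in_da(mass, ppm):
    """Convert a relative tolerance in ppm to an absolute tolerance in daltons."""
    return mass * ppm / 1e6


def distinguishable(delta_da, tolerance_da):
    """Simplified criterion: the mass difference must exceed the tolerance."""
    return abs(delta_da) > tolerance_da


def compare(peptide_mass=1500.0, ion_trap_da=1.0, orbitrap_ppm=10.0):
    """Report which mass differences each tolerance can separate."""
    orbitrap_da = tolerance_in_da(peptide_mass, orbitrap_ppm)
    return {
        name: {
            "ion_trap": distinguishable(delta, ion_trap_da),
            "orbitrap": distinguishable(delta, orbitrap_da),
        }
        for name, delta in MASS_SHIFTS.items()
    }


if __name__ == "__main__":
    for name, result in compare().items():
        print(f"{name}: {result}")
```

Because a ppm tolerance scales with mass, the same 10 ppm window is wider in absolute terms for a heavier peptide, so the smallest differences become harder to separate as peptide mass increases.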

Higher mass resolution improves the ability to determine the charge state of a peptide in shotgun proteomics experiments by resolving the isotopic envelope. As with better mass accuracy, this allows the set of theoretical spectrum candidate matches to be narrowed, increasing the resulting score and decreasing the likelihood of a false positive match. Higher mass resolution also improves the calculation of precursor peak areas by resolving the peaks of peptides with similar masses.
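The link between a resolved isotopic envelope and the charge state is simple arithmetic: adjacent isotope peaks differ by roughly one 13C-for-12C substitution (about 1.00335 Da), so on the m/z axis they sit about 1.00335/z apart. A minimal Python sketch, with hypothetical function names and made-up peak positions:

```python
C13_C12_DELTA = 1.00335  # mass difference between 13C and 12C, in daltons
PROTON = 1.00728


def charge_from_isotopes(mz_peaks):
    """Estimate the charge state from the mean spacing of resolved isotope peaks."""
    spacings = [b - a for a, b in zip(mz_peaks, mz_peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_DELTA / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass from the monoisotopic m/z and the charge."""
    return (mz - PROTON) * charge


if __name__ == "__main__":
    envelope = [464.7348, 465.2364, 465.7381]  # illustrative peak positions
    z = charge_from_isotopes(envelope)
    print(z, round(neutral_mass(envelope[0], z), 4))
```

If the instrument cannot resolve the individual isotope peaks, the spacing cannot be measured, and the search has to consider several possible charge states, and therefore several possible precursor masses, for the same spectrum.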

Global Profiling

A few micrograms of peptide digest is typically loaded for an assay. To account for sample losses and to permit re-analysis if necessary, we request about 50 micrograms of total protein per assay. Good results can be obtained from trace sample amounts - we can identify over 1,000 proteins from less than 1,000 cells procured via laser dissection.

Please send in the smallest volume possible. We will acetone precipitate protein in sample volumes greater than 2 mL.

Any species with an available sequence library (FASTA file) can be assayed. If a sequence library is not available for a species, or if the library is limited in scope, we can search a library from a related species for homologous proteins. There is no extra charge for searching a custom sequence library. We support all UniProt libraries by default. Search UniProt taxonomies to see if your organism is present.

In a typical complex sample such as a cell line we identify, on average, 5 different peptide sequences mapping to each protein and an average of 2 peptide spectral matches (identification events) per peptide.

The Q-Exactive has a sub-femtomole limit of detection (LOD). The LOD, limit of identification and limit of quantification for each peptide will vary based on several factors such as ionization efficiency and fragmentation characteristics.

Yes, we support SILAC and stable isotope dimethyl labeling. We recommend Life Tech reagents for SILAC experiments. We recommend this SILAC protocol and this stable isotope dimethyl labeling protocol.

We prefer SILAC, but it is limited to cell culture experiments. SILAC controls for variability throughout the sample preparation and UPLC-MS/MS assay. In contrast, stable isotope dimethyl labeling can be applied to any sample type post-digestion. Because of this, it only controls for variability post-derivatization, which is primarily the UPLC-MS/MS step.

Another option is to use a super-SILAC spike-in, in which a set of SILAC cultures are mixed to provide an adequate representation of the proteome. This super-SILAC mix is then mixed with the sample. This also controls for variability throughout sample preparation and UPLC-MS/MS.

We do not support isobaric tagging experiments (iTRAQ or TMT). We feel the instrumentation is not yet advanced enough to perform these assays without serious compromises which render this type of experiment less effective than label-free, SILAC or stable isotope dimethyl labeling experiments.

You will receive a free account on Proteome Cluster for viewing all details of the assays, including protein and peptide assignments, tandem mass spectra, run-level metrics, search parameters, etc. All data is available to download or share with your colleagues and is stored indefinitely.

Yes, we offer both iron- and titanium oxide-based phosphopeptide enrichments. Because phosphopeptides are present at very low abundance in a sample, we recommend providing at least 1-10 mg of total protein. This will permit the identification of thousands of discrete phosphopeptides.

We also offer lectin affinity-based glycopeptide or glycoprotein enrichments.

Yes, we offer albumin, immunoglobulin and multi-protein depletions from fluids such as plasma, urine and saliva.

Have another question not addressed in this FAQ? Email us.