isotopicdist

Calculate high-resolution isotope mass distribution and density function

Syntax

[MD, Info, DF] = isotopicdist(SeqAA)
[MD, Info, DF] = isotopicdist(Compound)
[MD, Info, DF] = isotopicdist(Formula)
isotopicdist(..., 'NTerminal', NTerminalValue, ...)
isotopicdist(..., 'CTerminal', CTerminalValue, ...)
isotopicdist(..., 'Resolution', ResolutionValue, ...)
isotopicdist(..., 'FFTResolution', FFTResolutionValue, ...)
isotopicdist(..., 'FFTRange', FFTRangeValue, ...)
isotopicdist(..., 'FFTLocation', FFTLocationValue, ...)
isotopicdist(..., 'NoiseThreshold', NoiseThresholdValue, ...)
isotopicdist(..., 'ShowPlot', ShowPlotValue, ...)

Description

[MD, Info, DF] = isotopicdist(SeqAA) analyzes a peptide sequence and returns a matrix containing the expected mass distribution; a structure containing the monoisotopic mass, average mass, most abundant mass, nominal mass, and empirical formula; and a matrix containing the expected density function.

[MD, Info, DF] = isotopicdist(Compound) analyzes a compound specified by a numeric vector or matrix.

[MD, Info, DF] = isotopicdist(Formula) analyzes a compound specified by an empirical chemical formula represented by the structure Formula. The field names in Formula must be valid element symbols and are case sensitive. The respective values in Formula are the number of atoms for each element. Formula can also be an array of structures that specifies multiple formulas. The field names can be in any order within a structure. However, if there are multiple structures, the order must be the same in each.

isotopicdist(..., 'PropertyName', PropertyValue, ...) calls isotopicdist with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Enclose each PropertyName in single quotation marks. Each PropertyName is case insensitive. These property name/property value pairs are as follows:

isotopicdist(..., 'NTerminal', NTerminalValue, ...) modifies the N-terminal of the peptide.

isotopicdist(..., 'CTerminal', CTerminalValue, ...) modifies the C-terminal of the peptide.

isotopicdist(..., 'Resolution', ResolutionValue, ...) specifies the approximate resolution of the instrument, given as the Gaussian width (in daltons) at full width at half height (FWHH).

isotopicdist(..., 'FFTResolution', FFTResolutionValue, ...) specifies the number of data points per dalton used by the fast Fourier transform (FFT) algorithm.

isotopicdist(..., 'FFTRange', FFTRangeValue, ...) specifies the absolute range (window size) in daltons for the FFT algorithm and output density function.

isotopicdist(..., 'FFTLocation', FFTLocationValue, ...) specifies the location of the FFT range (window) defined by FFTRangeValue. It specifies this location by setting the location of the lower limit of the range, relative to the location of the monoisotopic peak, which is computed by isotopicdist.

isotopicdist(..., 'NoiseThreshold', NoiseThresholdValue, ...) removes points in the mass distribution whose probability is smaller than 1/NoiseThresholdValue times the probability of the most abundant mass.

isotopicdist(..., 'ShowPlot', ShowPlotValue, ...) controls the display of a plot of the mass distribution.

Input Arguments

SeqAA

Peptide sequence specified by either a:

  • Character vector or string of single-letter codes

  • Cell array of character vectors or string vector that specifies multiple peptide sequences

Tip

You can use the getgenpept and genpeptread functions to retrieve peptide sequences from the GenPept database or a GenPept-formatted file. You can then use the cleave function to perform an in silico digestion on a peptide sequence. The cleave function creates a cell array of character vectors representing peptide fragments, which you can submit to the isotopicdist function.
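As a rough sketch of the workflow in this tip, the following code digests a peptide with cleave and passes the fragments to isotopicdist. The sequence is a made-up placeholder and 'trypsin' is only one possible enzyme choice. Both are assumptions for illustration, and the code requires Bioinformatics Toolbox.

```matlab
% Hypothetical peptide sequence, used only for illustration.
seq = 'MATLAPKGSTWYVCHEDRNQILFMK';

% In silico digestion: cleave returns a cell array of fragment sequences.
fragments = cleave(seq, 'trypsin');

% Submit all fragments at once and plot the first one.
isotopicdist(fragments, 'ShowPlot', true);
```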

Compound

Compound specified by either a:

  • Numeric vector of form [C H N O S], where C, H, N, O, and S are nonnegative numbers that represent the number of atoms of carbon, hydrogen, nitrogen, oxygen, and sulfur respectively in a compound.

  • M-by-5 numeric matrix that specifies multiple compounds, with each row corresponding to a compound and each column corresponding to an element, in the order [C H N O S].

Formula

Chemical formula specified by either a:

  • Structure whose field names are valid element symbols and case sensitive. Their respective values are the number of atoms for each element.

  • Array of structures that specifies multiple formulas.

Note

If Formula is a single structure, the order of the fields does not matter. If Formula is an array of structures, then the order of the fields must be the same in each structure.
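A minimal sketch of both forms of Formula is shown here. The compounds (glutamine and asparagine) and the variable names are chosen only for illustration.

```matlab
% Single formula: glutamine, C5H10N2O3. Field order is free here.
glutamine = struct('C', 5, 'H', 10, 'N', 2, 'O', 3);
[MD, Info] = isotopicdist(glutamine);

% Multiple formulas: every element of the array has the same
% fields in the same order (C, H, N, O).
formulas(1) = struct('C', 5, 'H', 10, 'N', 2, 'O', 3);  % glutamine
formulas(2) = struct('C', 4, 'H', 8,  'N', 2, 'O', 3);  % asparagine

% Plot the second compound in the array.
isotopicdist(formulas, 'ShowPlot', 2);
```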

NTerminalValue

Modification for the N-terminal of the peptide, specified by either:

  • One of 'none', 'amine' (default), 'formyl', or 'acetyl'

  • Custom modification specified by an empirical formula, represented by a structure. The structure must have field names that are valid element symbols and case sensitive. Their respective values are the number of atoms for each element.

CTerminalValue

Modification for the C-terminal of the peptide, specified by either:

  • One of 'none', 'freeacid' (default), or 'amide'

  • Custom modification specified by an empirical formula, represented by a structure. The structure must have field names that are valid element symbols and case sensitive. Their respective values are the number of atoms for each element.

ResolutionValue

Value in daltons specifying the approximate resolution of the instrument, given as the Gaussian width at full width at half height (FWHH).

Default: 1/8 Da

FFTResolutionValue

Value specifying the number of data points per dalton used by the FFT algorithm.

Default: 1000

FFTRangeValue

Value specifying the absolute range (window size) in daltons for the FFT algorithm and output density function. By default, this value is automatically estimated based on the weight of the molecule. The actual FFT range used internally by isotopicdist is further increased such that FFTRangeValue * FFTResolutionValue is a power of two.

Tip

Increase the FFTRangeValue if the signal represented by the DF output appears to be truncated.

Tip

Ultrahigh resolution allows you to resolve micropeaks that have the same nominal mass, but slightly different exact masses. To achieve ultrahigh resolution, increase FFTResolutionValue and reduce ResolutionValue, but ensure that FFTRangeValue * FFTResolutionValue is within the available memory.
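The size arithmetic behind these two tips can be sketched as follows. The input values are assumed examples, and the rounding shown (taking the next power of two of the point count) is an illustration of the stated constraint, not necessarily how isotopicdist performs it internally.

```matlab
% Assumed example settings, not values taken from isotopicdist.
fftRange      = 20;     % requested window size in daltons
fftResolution = 1000;   % data points per dalton

% Round the point count up to the next power of two.
nRequested = fftRange * fftResolution;   % 20000 points
nPoints    = 2^nextpow2(nRequested);     % 32768 points

% Window size that results from the rounded point count.
effectiveRange = nPoints / fftResolution;   % 32.768 Da

% Rough memory for one double-precision vector of that length.
bytesPerVector = nPoints * 8;
fprintf('%d points, %.3f Da window, about %d KB per vector\n', ...
        nPoints, effectiveRange, bytesPerVector / 1024);
```

Raising fftResolution by a factor of 10 raises the point count, and the memory needed, by roughly the same factor.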

FFTLocationValue

Fraction that specifies the location of the FFT range (window) defined by FFTRangeValue. It specifies this location by setting the location of the lower limit of the FFT range, relative to the location of the monoisotopic peak, which is computed by isotopicdist. The lower limit of the FFT range is set to the mass of the monoisotopic peak - (FFTLocationValue * FFTRangeValue).

Tip

You may need to shift the FFT range to the left in rare cases where a compound contains an element, such as iron or argon, whose most abundant isotope is not the lightest one.

Default: 1/16
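As a worked example of the lower-limit formula, the following sketch takes the first peak listed for the MATLAP example (643.3363) as the monoisotopic mass and assumes a 16 Da window. The window size and variable names are assumptions for illustration.

```matlab
monoMass    = 643.3363;  % first peak of the MATLAP example, taken as monoisotopic
fftLocation = 1/16;      % documented default
fftRange    = 16;        % assumed window size in daltons

% Lower limit = monoisotopic mass - (FFTLocationValue * FFTRangeValue)
lowerLimit = monoMass - fftLocation * fftRange;   % 642.3363

% Nominal upper limit, before any power-of-two enlargement of the range.
upperLimit = lowerLimit + fftRange;               % 658.3363
```

A larger fftLocation moves the whole window to the left, leaving more room below the monoisotopic peak.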

NoiseThresholdValue

Value that removes points in the mass distribution whose probability is smaller than 1/NoiseThresholdValue times the probability of the most abundant mass.

Default: 1e6
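The effect of the threshold can be mimicked on an existing mass distribution. This sketch applies the rule by hand to the MATLAP distribution listed in the Examples section, using an assumed threshold of 100 so that the cut is visible. It illustrates the rule only and does not call isotopicdist.

```matlab
% Mass distribution copied from the MATLAP example: [mass probability].
MD = [643.3363 0.6676
      644.3388 0.2306
      645.3378 0.0797
      646.3386 0.0181
      647.3396 0.0033
      648.3409 0.0005
      649.3423 0.0001];

noiseThreshold = 100;   % assumed value, far lower than the default 1e6

% Points below 1/noiseThreshold of the largest probability are dropped.
cutoff = max(MD(:,2)) / noiseThreshold;
kept   = MD(MD(:,2) >= cutoff, :);

disp(kept)   % the four most abundant peaks remain
```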

ShowPlotValue

Controls the display of a plot of the isotopic mass distribution. Choices are true, false, or I, an integer specifying the index of the compound to plot. If set to true, the first compound is plotted. Default is:

  • false — When you specify return values.

  • true — When you do not specify return values.

Output Arguments

MD

Mass distribution represented by a two-column matrix in which each row corresponds to an isotope. The first column lists the isotopic mass, and the second column lists the probability for that mass.

Info

Structure containing mass information for the peptide sequence or compound in the following fields:

  • NominalMass

  • MonoisotopicMass

  • ObservedAverageMass — Estimated from the DF signal output, using instrument resolution specified by the 'Resolution' property.

  • CalculatedAverageMass — Calculated directly from the input formula, assuming perfect instrument resolution.

  • MostAbundantMass

  • Formula — Structure containing the number of atoms of each element.

DF

Density function represented by a two-column matrix in which each row corresponds to an m/z value. The first column lists the mass, and the second column lists the relative intensity of the signal at that mass.
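A short sketch of how the three outputs might be used together is shown here. The peptide is the one from the Examples section, and the axis labels are illustrative choices.

```matlab
% Request all three outputs; no plot is shown by default in this case.
[MD, Info, DF] = isotopicdist('MATLAP');

% Scalar summaries from the Info structure.
fprintf('Monoisotopic mass:  %.4f\n', Info.MonoisotopicMass);
fprintf('Most abundant mass: %.4f\n', Info.MostAbundantMass);

% Most probable row of the discrete mass distribution.
[pMax, iMax] = max(MD(:,2));
fprintf('Peak at %.4f with probability %.4f\n', MD(iMax,1), pMax);

% Density function: mass in column 1, relative intensity in column 2.
plot(DF(:,1), DF(:,2));
xlabel('Mass');
ylabel('Relative intensity');
```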

Examples

Calculate and display the isotopic mass distribution of the peptide sequence MATLAP with an acetyl N-terminal and an amide C-terminal:

MD = isotopicdist('MATLAP','nterm','Acetyl','cterm','Amide', ...
                  'showplot',true)

MD =

  643.3363    0.6676
  644.3388    0.2306
  645.3378    0.0797
  646.3386    0.0181
  647.3396    0.0033
  648.3409    0.0005
  649.3423    0.0001
  650.3439    0.0000
  651.3455    0.0000

Calculate and display the isotopic mass distribution of glutamine (C5H10N2O3):

MD = isotopicdist([5 10 2 3 0],'showplot',true)

MD =

  146.0691    0.9328
  147.0715    0.0595
  148.0733    0.0074
  149.0755    0.0004
  150.0774    0.0000

Display the isotopic mass distribution of the "averagine" model, whose molecular formula represents the statistical occurrences of amino acids from all known proteins:

isotopicdist([4.9384 7.7583 1.3577 1.4773 0.0417])

References

[1] Rockwood, A. L., Van Orden, S. L., and Smith, R. D. (1995). Rapid Calculation of Isotope Distributions. Anal. Chem. 67:15, 2699–2704.

[2] Rockwood, A. L., Van Orden, S. L., and Smith, R. D. (1996). Ultrahigh Resolution Isotope Distribution Calculations. Rapid Commun. Mass Spectrom. 10, 54–59.

[3] Senko, M. W., Beu, S. C., and McLafferty, F. W. (1995). Automated assignment of charge states from resolved isotopic peaks for multiply charged ions. J. Am. Soc. Mass Spectrom. 6, 52–56.

[4] Senko, M. W., Beu, S. C., and McLafferty, F. W. (1995). Determination of monoisotopic masses and ion populations for large biomolecules from resolved isotopic distributions. J. Am. Soc. Mass Spectrom. 6, 229–233.

Version History

Introduced in R2009b