cleave

Cleave amino acid sequence with enzyme

Syntax

Fragments = cleave(SeqAA, Enzyme)
Fragments = cleave(SeqAA, PeptidePattern, Position)
[Fragments, CuttingSites] = cleave(...)
[Fragments, CuttingSites, Lengths] = cleave(...)
[Fragments, CuttingSites, Lengths, Missed] = cleave(...)
cleave(..., 'PartialDigest', PartialDigestValue, ...)
cleave(..., 'MissedSites', MissedSitesValue, ...)
cleave(..., 'Exception', ExceptionValue, ...)

Input Arguments

SeqAA

One of the following:

  • String of single-letter codes specifying an amino acid sequence.

  • Row vector of integers specifying an amino acid sequence.

  • MATLAB® structure containing a Sequence field that contains an amino acid sequence, such as returned by fastaread, getgenpept, genpeptread, getpdb, or pdbread.

Examples: 'ARN' or [1 2 3].

Enzyme

String specifying a name or abbreviation code for an enzyme or compound for which the literature specifies a cleavage rule.

    Tip   Use the cleavelookup function to display the names of enzymes and compounds in the cleavage rule library.

PeptidePattern

Short amino acid sequence to search for in SeqAA, a larger sequence. PeptidePattern can be a regular expression, such as '[KR](?!P)'.

Position

Integer from 0 to the length of PeptidePattern that specifies the position in PeptidePattern at which to cleave.

    Note:   Position 0 corresponds to the N terminal end of PeptidePattern.

PartialDigestValue

Value from 0 to 1 (default) specifying the probability that a cleavage site will be cleaved.

MissedSitesValue

Nonnegative integer specifying the maximum number of missed cleavage sites. The output includes all possible peptide fragments that can result from missing MissedSitesValue or fewer cleavage sites. Default is 0, which is equivalent to an ideal digestion.

ExceptionValue

Regular expression specifying an exception rule to the cleavage rule associated with Enzyme. By default, cleave applies no exception rule.

Output Arguments

Fragments

Cell array of strings representing the fragments from the cleavage.

CuttingSites

Numeric vector containing indices representing the cleavage sites.

    Note:   The cleave function adds a 0 to the list, so numel(CuttingSites)==numel(Fragments). Use CuttingSites + 1 to get the index, relative to the original sequence, of the first amino acid of each fragment.

Lengths

Numeric vector containing the length of each fragment.

Missed

Numeric vector containing the number of missed cleavage sites for every peptide fragment.

Description

Fragments = cleave(SeqAA, Enzyme) cuts SeqAA, an amino acid sequence, into parts at the cleavage sites specific for Enzyme, a string specifying a name or abbreviation code for an enzyme or compound for which the literature specifies a cleavage rule. It returns Fragments, a cell array of strings representing the fragments from the cleavage.

    Tip   Use the cleavelookup function to display the names of enzymes and compounds in the cleavage rule library.

Fragments = cleave(SeqAA, PeptidePattern, Position) cuts SeqAA, an amino acid sequence, into parts at the cleavage sites specified by a peptide pattern and position.
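The following is a minimal sketch, not the toolbox implementation, of how a peptide pattern and a position could map to cut sites. It uses only the base MATLAB regexp function. The sequence is the first 18 residues of the protein used in the Examples section, and the rule start + position - 1 is an assumption based on the note that position 0 corresponds to the N-terminal end of the pattern.

```matlab
% Illustrative sketch only; the real cleave function may differ in detail.
seq      = 'MGTGGRRGAAAAPLLVAV';   % first 18 residues of the example protein
pattern  = '[KR](?!P)';            % K or R that is not followed by P
position = 1;                      % cut after the first residue of each match

starts = regexp(seq, pattern, 'start');        % index where each match begins
cuts   = starts + position - 1;                % last residue before each cut (assumed rule)
cuts   = cuts(cuts > 0 & cuts < numel(seq));   % ignore cuts at either end of the sequence

sites     = [0 cuts];                          % leading 0, as in CuttingSites
lengths   = diff([sites numel(seq)]);          % length of each fragment
fragments = arrayfun(@(s, n) seq(s+1:s+n), sites, lengths, ...
                     'UniformOutput', false);  % cell array of fragment strings
```

For this input, sites is [0 6 7] and fragments is {'MGTGGR', 'R', 'GAAAAPLLVAV'}, which agrees with the start of the trypsin output shown in the Examples section.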

[Fragments, CuttingSites] = cleave(...) returns a numeric vector containing indices representing the cleavage sites.

    Note:   The cleave function adds a 0 to the list, so numel(CuttingSites)==numel(Fragments). Use CuttingSites + 1 to get the index, relative to the original sequence, of the first amino acid of each fragment.

[Fragments, CuttingSites, Lengths] = cleave(...) returns a numeric vector containing the length of each fragment.

[Fragments, CuttingSites, Lengths, Missed] = cleave(...) returns a numeric vector containing the number of missed cleavage sites for every fragment.
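As a small worked example of how CuttingSites and Lengths relate to positions in the original sequence, the sketch below uses the first four rows of the trypsin output from the Examples section. The variable names firstResidue and lastResidue are illustrative and are not part of the cleave interface.

```matlab
% Values copied from the first four trypsin fragments in the Examples section.
sites   = [0 6 7 41];             % CuttingSites
lengths = [6 1 34 5];             % Lengths

firstResidue = sites + 1;         % index of the first amino acid of each fragment
lastResidue  = sites + lengths;   % index of the last amino acid of each fragment
```

Here firstResidue is [1 7 8 42] and lastResidue is [6 7 41 46], so each fragment ends where the next cutting site begins.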

cleave(..., 'PropertyName', PropertyValue, ...) calls cleave with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Enclose each PropertyName in single quotation marks. Each PropertyName is case insensitive. These property name/property value pairs are as follows:

cleave(..., 'PartialDigest', PartialDigestValue, ...) simulates a partial digestion where PartialDigestValue is the probability of a cleavage site being cut. PartialDigestValue is a value from 0 to 1 (default).
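To make the idea of a per-site cleavage probability concrete, here is a hedged sketch in base MATLAB. It assumes that each candidate site is cut independently with probability p, which may not match how cleave samples sites internally. The candidate sites 6 and 7 are the ones reported for the first 18 residues in the trypsin example.

```matlab
% Sketch of a partial digest: each candidate site is cut with probability p.
seq  = 'MGTGGRRGAAAAPLLVAV';   % first 18 residues of the example protein
cuts = [6 7];                  % candidate sites for '[KR](?!P)' at position 1
p    = 0.5;                    % plays the role of PartialDigestValue

kept      = cuts(rand(size(cuts)) < p);    % randomly keep a subset of the sites
sites     = [0 kept];                      % leading 0, as in CuttingSites
lengths   = diff([sites numel(seq)]);      % length of each fragment
fragments = arrayfun(@(s, n) seq(s+1:s+n), sites, lengths, ...
                     'UniformOutput', false);
```

Because the sites are sampled at random, repeated runs can return different fragments. With p = 1 every candidate site is cut, and with p = 0 the sequence is returned as a single fragment.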

This table lists some common proteases and their cleavage sites.

Protease          Peptide Pattern   Position
Aspartic acid N   D                 1
Chymotrypsin      [WYF](?!P)        1
Glutamine C       [ED](?!P)         1
Lysine C          [K](?!P)          1
Trypsin           [KR](?!P)         1

cleave(..., 'MissedSites', MissedSitesValue, ...) returns all possible peptide fragments that can result from missing MissedSitesValue or fewer cleavage sites. MissedSitesValue is a nonnegative integer. Default is 0, which is equivalent to an ideal digestion.
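The enumeration of fragments with missed sites can be sketched in base MATLAB as follows. This is an illustration under assumptions: the order in which cleave returns the fragments is not specified here, and the variable names are hypothetical. A fragment with m missed sites is formed by joining m + 1 consecutive fragments of the ideal digest.

```matlab
% Sketch: list every fragment that spans at most maxMissed missed sites.
seq       = 'MGTGGRRGAAAAPLLVAV';      % first 18 residues of the example protein
sites     = [0 6 7];                   % cutting sites of the ideal digest
ends      = [sites(2:end) numel(seq)]; % last residue of each ideal fragment
maxMissed = 1;                         % plays the role of MissedSitesValue

fragments = {};
missed    = [];
for m = 0:maxMissed                    % number of missed sites in the fragment
    for i = 1:numel(sites) - m         % index of the ideal fragment it starts at
        fragments{end+1} = seq(sites(i)+1 : ends(i+m)); %#ok<SAGROW>
        missed(end+1)    = m;                            %#ok<SAGROW>
    end
end
```

For this input, fragments is {'MGTGGR', 'R', 'GAAAAPLLVAV', 'MGTGGRR', 'RGAAAAPLLVAV'} and missed is [0 0 0 1 1].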

cleave(..., 'Exception', ExceptionValue, ...) specifies an exception rule to the cleavage rule associated with Enzyme. ExceptionValue is a regular expression. By default, cleave applies no exception rule.
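One possible reading of an exception rule is sketched below in base MATLAB. It assumes that a cut is suppressed whenever the exception pattern matches starting at the same residue as the cleavage pattern; cleave may apply exceptions differently. The sequence is residues 75 to 110 of the protein in the Examples section, reassembled from the trypsin fragments shown there.

```matlab
% Sketch: trypsin-like rule with the exception "do not cut after K before D".
seq       = 'DLSFPKLIMITDYLLLFRVYGLESLKDLFPNLTVIR';
pattern   = '[KR](?!P)';     % cleavage rule, cut at position 1
exception = 'KD';            % plays the role of ExceptionValue

starts  = regexp(seq, pattern, 'start');     % candidate sites: 6, 18, 26, 36
blocked = regexp(seq, exception, 'start');   % sites matched by the exception: 26
starts  = setdiff(starts, blocked);          % drop the blocked sites (assumed rule)
cuts    = starts(starts < numel(seq));       % ignore a cut at the end of the sequence

sites     = [0 cuts];
lengths   = diff([sites numel(seq)]);
fragments = arrayfun(@(s, n) seq(s+1:s+n), sites, lengths, ...
                     'UniformOutput', false);
```

Without the exception the sequence splits into four fragments. With it, the cut after K at residue 26 is skipped and fragments is {'DLSFPK', 'LIMITDYLLLFR', 'VYGLESLKDLFPNLTVIR'}.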

Examples

  1. Retrieve a protein sequence from the GenPept database.

    S = getgenpept('AAA59174');
  2. Cleave the sequence using proteinase K.

    [partsPK, sitesPK, lengthsPK] = cleave(S.Sequence, ...
     'proteinase K');
  3. Display the indices of the cleavage sites, lengths, and sequences of the first ten fragments.

    for i = 1:10
        fprintf('%5d%5d   %s\n', sitesPK(i), lengthsPK(i), partsPK{i})
    end
    
     0    3   MGT
     3    6   GGRRGA
     9    1   A
    10    1   A
    11    1   A
    12    2   PL
    14    1   L
    15    1   V
    16    1   A
    17    1   V
  4. Cleave the same sequence using one of trypsin's cleavage rules (cleave after K or R when the next residue is not P).

    [partsT, sitesT, lengthsT] = cleave(S.Sequence,'[KR](?!P)',1);
    
  5. Display the indices of the cleavage sites, lengths, and sequences of the first ten fragments.

    for i = 1:10
        fprintf('%5d%5d   %s\n', sitesT(i), lengthsT(i), partsT{i})
    end
    
      0    6   MGTGGR
      6    1   R
      7   34   GAAAAPLLVAVAALLLGAAGHLYPGEVCPGMDIR
     41    5   NNLTR
     46   21   LHELENCSVIEGHLQILLMFK
     67    7   TRPEDFR
     74    6   DLSFPK
     80   12   LIMITDYLLLFR
     92    8   VYGLESLK
    100   10   DLFPNLTVIR
  6. Cleave the same sequence using trypsin's cleavage rule, but allow for one missed cleavage site.

    [partsT2, sitesT2, lengthsT2, missedT2] = cleave(S.Sequence, ...
                                           'trypsin','missedsites',1);
  7. Cleave the same sequence using trypsin's cleavage rule, but do not cleave after K when K is followed by a D.

    partsT3 = cleave(S.Sequence, 'trypsin', 'exception', 'KD');