Calculate sequence properties of DNA oligonucleotide


SeqProperties = oligoprop(SeqNT) returns the sequence properties for a DNA oligonucleotide as a structure.


SeqProperties = oligoprop(SeqNT,Name,Value) uses additional options specified by one or more Name,Value pair arguments.
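As a brief illustration of the second syntax, the following sketch passes two of the options described under Name-Value Arguments. The sequence and the option values are arbitrary choices made for this example, and the calls assume that Bioinformatics Toolbox is installed.

```matlab
% Illustrative sequence (an assumption made for this example).
seq = 'ACGTAGAGGACGTN';

% Call with default options.
S = oligoprop(seq);

% Same sequence with a salt concentration of 0.02 moles/liter and a
% temperature of 20 degrees Celsius.
Sopt = oligoprop(seq, 'Salt', 0.02, 'Temp', 20);

% Show the six melting temperatures of both calls side by side.
disp([S.Tm; Sopt.Tm])
```

Comparing the two rows gives a quick view of how strongly each melting temperature method responds to the changed options.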


  1. Create a random sequence.

    seq = randseq(25)
    seq =
    TAGCTTCATCGTTGACTTCTACTAA

  2. Calculate sequence properties of the sequence.

    S1 = oligoprop(seq)
    S1 = 
                    GC: 36
               GCdelta: 0
              Hairpins: [0x25 char]
                Dimers: 'tAGCTtcatcgttgacttctactaa'
             MolWeight: 7.5820e+003
        MolWeightdelta: 0
                    Tm: [52.7640 60.8629 62.2493 55.2870 54.0293 61.0614]
               Tmdelta: [0 0 0 0 0 0]
                Thermo: [4x3 double]
           Thermodelta: [4x3 double]

  3. List the thermodynamic calculations for the sequence.

    S1.Thermo
    ans =
     -178.5000 -477.5700  -36.1125
     -182.1000 -497.8000  -33.6809
     -190.2000 -522.9000  -34.2974
     -191.9000 -516.9000  -37.7863

  1. Calculate sequence properties of the sequence ACGTAGAGGACGTN.

    S2 = oligoprop('ACGTAGAGGACGTN')
    S2 = 
                    GC: 53.5714
               GCdelta: 3.5714
              Hairpins: 'ACGTagaggACGTn'
                Dimers: [3x14 char]
             MolWeight: 4.3329e+003
        MolWeightdelta: 20.0150
                    Tm: [38.8357 42.2958 57.7880 52.4180 49.9633 55.1330]
               Tmdelta: [1.4643 1.4643 10.3885 3.4633 0.2829 3.8074]
                Thermo: [4x3 double]
           Thermodelta: [4x3 double]

  2. List the potential dimers for the sequence. The result is the 3-by-14 character array stored in the Dimers field.

    S2.Dimers

Input Arguments

SeqNT: DNA oligonucleotide sequence represented by any of the following:

  • Character vector or string containing the letters A, C, G, T, or N

  • Vector of integers containing the integers 1, 2, 3, 4, or 15

  • Structure containing a Sequence field that contains a nucleotide sequence
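The three representations listed above can be sketched as follows. The integer coding A=1, C=2, G=3, T=4, N=15 is assumed here to follow the usual nt2int convention, the variable names are hypothetical, and the calls assume that Bioinformatics Toolbox is installed.

```matlab
% The same oligonucleotide in each accepted representation.
seqChar   = 'ACGTAGAGGACGTN';                   % character vector
seqInt    = [1 2 3 4 1 3 1 3 3 1 2 3 4 15];     % assumed coding: A=1, C=2, G=3, T=4, N=15
seqStruct = struct('Sequence', seqChar);        % structure with a Sequence field

% Each form is passed to oligoprop in the same way.
Pchar   = oligoprop(seqChar);
Pint    = oligoprop(seqInt);
Pstruct = oligoprop(seqStruct);
```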

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: 'Salt',0.02,'Temp',20 specifies a salt concentration of 0.02 moles/liter and a temperature of 20 degrees Celsius.

Salt: Specify the salt concentration in moles/liter for melting temperature calculations.

Example: 'Salt',0.02

Temp: Specify the temperature in degrees Celsius for nearest-neighbor calculations of free energy.

Example: 'Temp',20

Primerconc: Specify the primer concentration in moles/liter for melting temperature calculations.

Example: 'Primerconc',40e-6

HPBase: Specify the minimum number of paired bases that form the neck of a hairpin.

Example: 'HPBase',6

HPLoop: Specify the minimum number of bases that form the loop of a hairpin.

Example: 'HPLoop',2
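To make the neck and loop terminology concrete, the sketch below takes the hairpin row shown for S2 in the examples and measures its neck and loop. It is plain MATLAB written for illustration, not the algorithm oligoprop uses, and all variable names are hypothetical.

```matlab
% Hairpin row copied from the S2 example output.
h = 'ACGTagaggACGTn';

% Capital letters mark the nucleotides that form the hairpin neck.
isNeck = isstrprop(h, 'upper');

% Locate the start and end of each run of capital letters.
d = diff([0 isNeck 0]);
runStart = find(d == 1);
runEnd   = find(d == -1) - 1;

% The two runs are the two strands of the neck; the loop lies between them.
stem1 = h(runStart(1):runEnd(1));
stem2 = h(runStart(2):runEnd(2));
loop  = h(runEnd(1)+1 : runStart(2)-1);

% Reverse complement of the first strand, written out by hand so that
% the sketch does not depend on any toolbox function.
[~, idx]   = ismember(stem1, 'ACGT');
complement = 'TGCA';
rcStem1    = fliplr(complement(idx));

neckLength = numel(stem1);            % quantity that HPBase puts a lower bound on
loopLength = numel(loop);             % quantity that HPLoop puts a lower bound on
neckPairs  = strcmp(rcStem1, stem2);  % true if the two strands can base-pair
```

For this row the neck is ACGT on both strands and the loop is agagg, so a hairpin like this one would only be reported when HPBase is at most 4 and HPLoop is at most 5.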

Dimerlength: Specify the minimum number of aligned bases between the sequence and its reverse.

Example: 'Dimerlength',6

Output Arguments

Sequence properties for a DNA oligonucleotide as a structure with the following fields:

GC: Percent GC content for the DNA oligonucleotide. Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, GC is the midpoint value, and its uncertainty is expressed by GCdelta.

GCdelta: The difference between GC (midpoint value) and either the maximum or minimum value GC could assume. The maximum and minimum values are calculated by assuming all N characters are G/C or not G/C, respectively. Therefore, GCdelta defines the possible range of GC content.

Hairpins: H-by-length(SeqNT) matrix of characters displaying all potential hairpin structures for the sequence SeqNT. Each row is a potential hairpin structure of the sequence, with the hairpin-forming nucleotides designated by capital letters. H is the number of potential hairpin structures for the sequence. Ambiguous N characters in SeqNT are considered to potentially complement any nucleotide.

Dimers: D-by-length(SeqNT) matrix of characters displaying all potential dimers for the sequence SeqNT. Each row is a potential dimer of the sequence, with the self-dimerizing nucleotides designated by capital letters. D is the number of potential dimers for the sequence. Ambiguous N characters in SeqNT are considered to potentially complement any nucleotide.

MolWeight: Molecular weight of the DNA oligonucleotide. Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, MolWeight is the midpoint value, and its uncertainty is expressed by MolWeightdelta.

MolWeightdelta: The difference between MolWeight (midpoint value) and either the maximum or minimum value MolWeight could assume. The maximum and minimum values are calculated by assuming all N characters are G or C, respectively. Therefore, MolWeightdelta defines the possible range of molecular weight for SeqNT.
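The midpoint and delta convention described for GC and GCdelta can be worked through by hand. The sketch below is plain MATLAB written for illustration (not the oligoprop implementation, and with hypothetical variable names); it reproduces the GC and GCdelta values shown for S2 in the examples.

```matlab
% Sequence from the S2 example; it contains one ambiguous N.
seq = upper('ACGTAGAGGACGTN');
n   = numel(seq);

nGC = sum(seq == 'G' | seq == 'C');   % positions that are definitely G or C
nN  = sum(seq == 'N');                % ambiguous positions

gcMin = 100 * nGC / n;                % assume no N is G or C
gcMax = 100 * (nGC + nN) / n;         % assume every N is G or C

GC      = (gcMin + gcMax) / 2;        % midpoint value
GCdelta = (gcMax - gcMin) / 2;        % half-width of the possible range

fprintf('GC = %.4f, GCdelta = %.4f\n', GC, GCdelta);
```

The possible GC content therefore spans GC - GCdelta to GC + GCdelta, which is 50 to about 57.14 percent for this sequence.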

Tm: A vector of six melting temperature values, in degrees Celsius, each calculated by a different method.

Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, Tm is the midpoint value, and its uncertainty is expressed by Tmdelta.

Tmdelta: A vector containing the differences between Tm (midpoint value) and either the maximum or minimum value Tm could assume for each of the six methods. Therefore, Tmdelta defines the possible range of melting temperatures for SeqNT.
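As a small worked example, the Tm and Tmdelta values shown for S2 in the examples can be combined into an explicit range for each of the six methods. The variable names are hypothetical and the code is plain MATLAB.

```matlab
% Values copied from the S2 example output.
Tm      = [38.8357 42.2958 57.7880 52.4180 49.9633 55.1330];
Tmdelta = [1.4643 1.4643 10.3885 3.4633 0.2829 3.8074];

% Row 1 holds the lowest possible melting temperature for each method,
% row 2 the highest.
TmRange = [Tm - Tmdelta; Tm + Tmdelta];

disp(TmRange)
```

The width of each column shows how sensitive the corresponding method is to the single ambiguous N in that sequence.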

Thermo: 4-by-3 matrix of thermodynamic calculations.

Each row corresponds to a different set of nearest-neighbor parameters.

The columns correspond to:

  • delta H — Enthalpy in kilocalories per mole, kcal/mol

  • delta S — Entropy in calories per mole-degrees Kelvin, cal/(K)(mol)

  • delta G — Free energy in kilocalories per mole, kcal/mol

Ambiguous N characters in SeqNT are considered to potentially be any nucleotide. If SeqNT contains ambiguous N characters, Thermo is the midpoint value, and its uncertainty is expressed by Thermodelta.

Thermodelta: 4-by-3 matrix containing the differences between Thermo (midpoint value) and either the maximum or minimum value Thermo could assume for each calculation and method. Therefore, Thermodelta defines the possible range of thermodynamic values for SeqNT.
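The three Thermo columns are linked by the standard relation delta G = delta H - T * delta S. The sketch below checks this against the Thermo matrix shown for S1 in the examples. The temperature of 25 degrees Celsius is an assumption inferred from those numbers, not a documented default, and the variable names are hypothetical.

```matlab
% Thermo matrix copied from the S1 example
% (columns: delta H, delta S, delta G).
Thermo = [-178.5000 -477.5700 -36.1125
          -182.1000 -497.8000 -33.6809
          -190.2000 -522.9000 -34.2974
          -191.9000 -516.9000 -37.7863];

T = 25 + 273.15;             % assumed temperature, converted to kelvin

dH = Thermo(:,1);            % kcal/mol
dS = Thermo(:,2) / 1000;     % cal/(K mol) converted to kcal/(K mol)
dG = dH - T * dS;            % free energy in kcal/mol

% Largest deviation from the tabulated delta G column.
maxErr = max(abs(dG - Thermo(:,3)));
```

Because delta S is reported in calories while delta H and delta G are in kilocalories, the division by 1000 is needed before the columns can be combined.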


