int2nt
Convert nucleotide sequence from integer to letter representation
Syntax
SeqChar = int2nt(SeqInt)
SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...)
SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...)
SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...)
Input Arguments
SeqInt | Row vector of integers specifying a nucleotide sequence. For valid integers, see the table Mapping Nucleotide Integers to Letter Codes. Integers are arbitrarily assigned to IUB/IUPAC letters. |
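AlphabetValue | Character vector or string specifying a nucleotide alphabet. Choices are 'DNA' (default), which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. |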
UnknownValue | Character to represent unknown nucleotides, that is, 0 or integers ≥ 17. Choices are any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *. |
CaseValue | Character vector or string specifying whether the output is uppercase or lowercase. Choices are 'upper' (default) or 'lower'. |
Output Arguments
SeqChar | Nucleotide sequence specified by a character vector of codes. |
Description
SeqChar = int2nt(SeqInt) converts SeqInt, a row vector of integers specifying a nucleotide sequence, to SeqChar, a character vector of codes specifying the same nucleotide sequence. For valid codes, see the table Mapping Nucleotide Integers to Letter Codes.
Mapping Nucleotide Integers to Letter Codes
| Nucleotide | Integer | Code |
|---|---|---|
| Adenosine | 1 | A |
| Cytidine | 2 | C |
| Guanine | 3 | G |
| Thymidine | 4 | T |
| Uridine (if 'Alphabet' set to 'RNA') | 4 | U |
| Purine (A or G) | 5 | R |
| Pyrimidine (T or C) | 6 | Y |
| Keto (G or T) | 7 | K |
| Amino (A or C) | 8 | M |
| Strong interaction (3 H bonds) (G or C) | 9 | S |
| Weak interaction (2 H bonds) (A or T) | 10 | W |
| Not A (C or G or T) | 11 | B |
| Not C (A or G or T) | 12 | D |
| Not G (A or C or T) | 13 | H |
| Not T or U (A or C or G) | 14 | V |
| Any nucleotide (A or C or G or T or U) | 15 | N |
| Gap of indeterminate length | 16 | - |
| Unknown (any integer not in table) | 0 or ≥ 17 | * (default) |
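As a quick illustration of the table, the sketch below converts a vector that mixes ambiguity codes, the gap code, and an out-of-range integer. It assumes Bioinformatics Toolbox is installed; the variable names seqInt and seqChar are illustrative only.

```matlab
% Integers taken from the table above:
% 5 = R (purine), 6 = Y (pyrimidine), 15 = N (any nucleotide),
% 16 = gap, 0 = unknown (shown as the default symbol *)
seqInt = [5 6 15 16 0];
seqChar = int2nt(seqInt);   % by the table, this should read RYN-*
disp(seqChar)
```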
SeqChar = int2nt(SeqInt, ...'PropertyName', PropertyValue, ...) calls int2nt with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:
SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...) specifies a nucleotide alphabet. AlphabetValue can be 'DNA', which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. Default is 'DNA'.
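The effect of the 'Alphabet' property can be seen by converting the same integers under both alphabets. This is a minimal sketch that assumes Bioinformatics Toolbox is installed; the variable names are illustrative.

```matlab
seqInt = [1 2 4 3];                       % 4 maps to T (DNA) or U (RNA)
dnaSeq = int2nt(seqInt);                  % default 'DNA' alphabet: ACTG
rnaSeq = int2nt(seqInt, 'Alphabet', 'RNA');   % 'RNA' alphabet: ACUG
disp(dnaSeq)
disp(rnaSeq)
```

Only the integer 4 is affected by the alphabet choice; all other codes map to the same letters.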
SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...) specifies the character to represent unknown nucleotides, that is, 0 or integers ≥ 17. UnknownValue can be any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *.
SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...) specifies whether the output is uppercase or lowercase. CaseValue can be 'upper' (default) or 'lower'.
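For instance, the following sketch requests lowercase output for the same sequence used in the Examples section. It assumes Bioinformatics Toolbox is installed; the variable names are illustrative.

```matlab
seqInt = [1 2 4 3 2 4 1 3 2];
lowerSeq = int2nt(seqInt, 'Case', 'lower');   % lowercase letter codes: actgctagc
disp(lowerSeq)
```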
Examples
Convert a nucleotide sequence from integer to letter representation.
s = int2nt([1 2 4 3 2 4 1 3 2])
s =
ACTGCTAGC
Convert a nucleotide sequence from integer to letter representation and define # as the symbol for unknown numbers 17 and greater.
si = [1 2 4 20 2 4 40 3 2];
s = int2nt(si, 'unknown', '#')
s =
ACT#CT#GC
Version History
Introduced before R2006a
See Also
aa2int | baselookup | int2aa | nt2int