int2nt

Convert nucleotide sequence from integer to letter representation

Syntax

SeqChar = int2nt(SeqInt)
SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...)
SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...)
SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...)

Input Arguments

`SeqInt`	Row vector of integers specifying a nucleotide sequence. For valid integers, see the table Mapping Nucleotide Integers to Letter Codes. Integers are arbitrarily assigned to IUB/IUPAC letters.
`AlphabetValue`	Character vector or string specifying a nucleotide alphabet. Choices are: `'DNA'` (default) — Uses the symbols `A`, `C`, `G`, and `T`. `'RNA'` — Uses the symbols `A`, `C`, `G`, and `U`.
`UnknownValue`	Character to represent unknown nucleotides, that is `0` or integers ≥ `17`. Choices are any character other than the nucleotide characters `A`, `C`, `G`, `T`, and `U` and the ambiguous nucleotide characters `N`, `R`, `Y`, `K`, `M`, `S`, `W`, `B`, `D`, `H`, and `V`. Default is `*`.
`CaseValue`	Character vector or string specifying the upper or lower case. Choices are `'upper'` (default) or `'lower'`.

Output Arguments

`SeqChar`	Nucleotide sequence specified by a character vector of codes.

Description

SeqChar = int2nt(SeqInt) converts SeqInt, a row vector of integers specifying a nucleotide sequence, to SeqChar, a character vector of codes specifying the same nucleotide sequence. For valid codes, see the table Mapping Nucleotide Integers to Letter Codes.

Mapping Nucleotide Integers to Letter Codes

Nucleotide	Integer	Code
Adenosine	`1`	`A`
Cytidine	`2`	`C`
Guanosine	`3`	`G`
Thymidine	`4`	`T`
Uridine (if `'Alphabet'` set to `'RNA'`)	`4`	`U`
Purine (`A` or `G`)	`5`	`R`
Pyrimidine (`T` or `C`)	`6`	`Y`
Keto (`G` or `T`)	`7`	`K`
Amino (`A` or `C`)	`8`	`M`
Strong interaction (3 H bonds) (`G` or `C`)	`9`	`S`
Weak interaction (2 H bonds) (`A` or `T`)	`10`	`W`
Not `A` (`C` or `G` or `T`)	`11`	`B`
Not `C` (`A` or `G` or `T`)	`12`	`D`
Not `G` (`A` or `C` or `T`)	`13`	`H`
Not `T` or `U` (`A` or `C` or `G`)	`14`	`V`
Any nucleotide (`A` or `C` or `G` or `T` or `U`)	`15`	`N`
Gap of indeterminate length	`16`	`-`
Unknown (any integer not in table)	`0` or ≥ `17`	`*` (default)
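
The table can be read as a simple lookup from integer to letter code. The following is a minimal sketch of that lookup in plain MATLAB for the default DNA alphabet. It is only an illustration of the mapping, not the toolbox implementation of int2nt, and the variable names (`codes`, `unknownChar`, `isValid`) are chosen here for the example. As an assumption of this sketch, any integer outside 1 through 16 is treated as unknown.

```matlab
% Letter codes indexed by the integers 1..16, in the order of the table above
% (DNA alphabet, so 4 maps to T).
codes = 'ACGTRYKMSWBDHVN-';
unknownChar = '*';                 % default symbol for unknown integers

seqInt = [1 2 4 20 2 4 40 3 2 0 16];

% Mark every position as unknown first, then fill in the valid codes.
seqChar = repmat(unknownChar, size(seqInt));
isValid = seqInt >= 1 & seqInt <= 16;
seqChar(isValid) = codes(seqInt(isValid));

disp(seqChar)                      % displays ACT*CT*GC*-
```

The integers 20, 40, and 0 have no entry in the table, so they appear as `*`, while 16 maps to the gap symbol `-`.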

SeqChar = int2nt(SeqInt, ...'PropertyName', PropertyValue, ...) calls int2nt with optional properties specified as property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. The property name/property value pairs are as follows:

SeqChar = int2nt(SeqInt, ...'Alphabet', AlphabetValue, ...) specifies a nucleotide alphabet. AlphabetValue can be 'DNA', which uses the symbols A, C, G, and T, or 'RNA', which uses the symbols A, C, G, and U. Default is 'DNA'.
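
As a short illustration of the 'Alphabet' property, the sketch below converts the same integer vector with both alphabets. It assumes int2nt is available on the MATLAB path; the variable names are chosen for the example, and the results in the comments follow from the mapping table, in which only the integer `4` differs between the two alphabets.

```matlab
si = [1 2 4 3 2 4 1 3 2];

% Default DNA alphabet: 4 maps to T.
dna = int2nt(si)                      % ACTGCTAGC

% RNA alphabet: 4 maps to U instead of T.
rna = int2nt(si, 'Alphabet', 'RNA')   % ACUGCUAGC
```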

SeqChar = int2nt(SeqInt, ...'Unknown', UnknownValue, ...) specifies the character to represent unknown nucleotides, that is 0 or integers ≥ 17. UnknownValue can be any character other than the nucleotide characters A, C, G, T, and U and the ambiguous nucleotide characters N, R, Y, K, M, S, W, B, D, H, and V. Default is *.

SeqChar = int2nt(SeqInt, ...'Case', CaseValue, ...) specifies the upper or lower case. CaseValue can be 'upper' (default) or 'lower'.
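
Because properties can be given in any order and combined, a call can set the case and the unknown symbol together. The sketch below assumes int2nt is available on the MATLAB path; the result shown in the comment is what the mapping table implies for this input.

```matlab
si = [1 2 4 20 2 4 40 3 2];

% Lowercase output, with # standing in for the unknown integers 20 and 40.
s = int2nt(si, 'Case', 'lower', 'Unknown', '#')   % act#ct#gc
```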

Examples

Convert a nucleotide sequence from integer to letter representation.
```
s = int2nt([1 2 4 3 2 4 1 3 2])

s =
ACTGCTAGC
```
Convert a nucleotide sequence from integer to letter representation, and define # as the symbol for unknown integers (0, or 17 and greater).
```
si = [1 2 4 20 2 4 40 3 2];
s = int2nt(si, 'unknown', '#')

s =
ACT#CT#GC
```

Version History

Introduced before R2006a