Replacing characters with integers in a very long string
6 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Paolo Binetti
el 17 de Dic. de 2016
Comentada: Star Strider
el 18 de Dic. de 2016
I have a string of a few millions characters, want to replace it with a vector of integers according to simple rules, such as 'C' = -1 and so forth. My implementation works but takes forever and uses gigabytes of memory, in particular due to the str2num function, to my understanding. Is there a way to go more efficiently?
sequence = fileread('sourcefile.txt');
sequence_num = strrep(sequence, 'A', '0 ');
sequence_num = strrep(sequence_num,'C','-1 ');
sequence_num = strrep(sequence_num,'G', '1 ');
sequence_num = strrep(sequence_num,'T', '0 ');
sequence_num = regexprep(sequence_num,'\r\n','');
sequence_num = str2num(sequence_num);
sequence_num = int32(sequence_num);
0 comentarios
Respuesta aceptada
Star Strider
el 17 de Dic. de 2016
I don’t know what structure ‘sequence’ has. I created it as a cell array here:
bases = {'A','C','T','G'}; % Cell Array
sequence = bases(randi(4, 1, 20)); % Create Data
skew = zeros(1, length(sequence)+1,'int32'); % Preallocate
Cix = find(ismember(sequence, 'C')); % Logical Vector
Gix = find(ismember(sequence, 'G')); % Logical Vector
skew(Cix+1) = -1; % Replace With Integer
skew(Gix+1) = +1; % Replace With Integer
7 comentarios
Star Strider
el 18 de Dic. de 2016
Our pleasure!
It is always more gratifying to help with real-world research. We wish you well!
Más respuestas (0)
Ver también
Categorías
Más información sobre Characters and Strings en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!