Help!!! how to search for some xx xx xx xx(hex) in a dat file very fast!!!
Eric Jiang
on 29 Jun 2019
Commented: Eric Jiang
on 3 Jul 2019
Help! I have a .dat file of about 40 MB.
I want to search it for a 4-byte pattern, xx xx xx xx (hex).
I can do it with a for or while loop, but that is too slow because the file holds about 40 million bytes.
How can I speed this up? Thanks!
Accepted Answer
Guillaume
on 30 Jun 2019
Edited: Guillaume
on 30 Jun 2019
Unlike per isakson, I'm assuming that you're looking for a byte pattern (given in hexadecimal format) in a binary file. If you're looking for a pattern of hexadecimal characters in a text file see per's answer.
%input
hexpattern = ['41'; 'AB'; 'FF'; '7E']; %you haven't specified how this is stored. Taking a guess
filetosearch = 'C:\somewhere\somefolder\somefile.dat'; %doesn't have to have .dat extension
%read file
fid = fopen(filetosearch, 'r');
assert(fid > 0, 'Failed to open file. Most likely the wrong path was specified');
filecontent = fread(fid, [1 Inf], '*uint8'); %read all bytes at once
fclose(fid);
%pattern search
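patternvalues = hex2dec(hexpattern).'; %hex2dec returns a column; transpose so the pattern is a row vector like filecontent
patternlocation = strfind(filecontent, patternvalues); %despite its name strfind also works for numbers
fprintf('Hex pattern was found at byte(s) %s\n', strjoin(compose('%d', patternlocation), ', ')); %fprintf prints the result; sprintf followed by a semicolon would display nothing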
edited as I got per isakson and dpb mixed up
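A small self-contained check of the byte-pattern search can be useful before running it on the real 40 MB file. The sketch below is only an illustration under stated assumptions: the test data, the temporary file, and the space-separated input string `'41 AB FF 7E'` are all hypothetical, since the thread never says how the pattern is stored. As a variation on the answer above, it reads the bytes as `char` (`'uint8=>char'`) so that `strfind` is used on its documented input type.

```matlab
% Hypothetical input format: a space-separated string of hex bytes
hexstring = '41 AB FF 7E';
pattern   = uint8(sscanf(hexstring, '%x')).';   % row vector [65 171 255 126]

% Hypothetical test data: 1000 zero bytes with the pattern planted at two known offsets
data = zeros(1, 1000, 'uint8');
data(101:104) = pattern;
data(501:504) = pattern;

% Write the bytes to a temporary file, then read them back in a single call
tmpfile = [tempname '.dat'];
fid = fopen(tmpfile, 'w');
assert(fid > 0, 'Could not create the temporary file');
fwrite(fid, data, 'uint8');
fclose(fid);

fid = fopen(tmpfile, 'r');
assert(fid > 0, 'Could not reopen the temporary file');
filecontent = fread(fid, [1 Inf], 'uint8=>char');   % one char per byte, codes 0..255
fclose(fid);
delete(tmpfile);

% Search for the pattern; both arguments are char row vectors
patternlocation = strfind(filecontent, char(pattern));
fprintf('Pattern found at byte(s): %s\n', mat2str(patternlocation));
```

The reported positions are 1-based offsets into the file, so the two planted copies should be reported at 101 and 501. Subtract 1 if a zero-based file offset is needed, for example for `fseek`.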
More Answers (2)
per isakson
on 30 Jun 2019
Edited: per isakson
on 30 Jun 2019
Your question is very vague and leaves room for interpretation.
I assume that the dat-file is an ordinary text file. I cannot guess in what form you want the hex strings that are found.
However, I made a little test:
- created a 10MB text file, cssm.txt
- created a script, cssm.m
%%
tic
txt = fileread( 'cssm.txt' );
toc
%%
tic
cac = regexp( txt, '([0-9A-F]{2} ){3}[0-9A-F]{2}', 'match' );
toc
- ran cssm
Elapsed time is 0.133106 seconds.
Elapsed time is 0.357219 seconds.
- and peeked at the result
>> cac{[1,2,3601]}
ans =
'01 23 45 67'
ans =
'89 AB CD EF'
ans =
'01 23 45 67'
>>
I doubt that you can do it significantly faster with plain MATLAB on a standard desktop PC.
Triggered by Guillaume's answer: To get the locations of the hex-strings replace
cac = regexp( txt, '([0-9A-F]{2} ){3}[0-9A-F]{2}', 'match' );
by
[cac,loc] = regexp( txt, '([0-9A-F]{2} ){3}[0-9A-F]{2}', 'match', 'start' );
and peek
>> loc([1,2,3601])
ans =
33 2793 9936083
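The same calls can be tried on a short string to see what `cac` and `loc` contain. This is a minimal sketch: the sample text is hypothetical (the real cssm.txt is not shown in the thread), and the `strfind` call for one specific hex string is my own addition, on the assumption that the string being searched for is known in advance.

```matlab
% Hypothetical sample text standing in for the contents of cssm.txt
txt = 'header 01 23 45 67 filler 89 AB CD EF end';

% Any four space-separated, two-digit, uppercase hex values
[cac, loc] = regexp(txt, '([0-9A-F]{2} ){3}[0-9A-F]{2}', 'match', 'start');

% One specific, known hex string: a literal search needs no regular expression
target    = '89 AB CD EF';
targetloc = strfind(txt, target);

disp(cac);        % the matched hex strings
disp(loc);        % 1-based start position of each match
disp(targetloc);  % 1-based start position(s) of the specific string
```

Here `loc` holds the character position at which each match begins, so the second entry of `loc` and `targetloc` point at the same place in `txt`.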
dpb
on 30 Jun 2019
Edited: dpb
on 30 Jun 2019
If it's performance you're looking for, pass the job off to a grep utility. There are any number of freeware versions available for Windows, if one isn't already installed on your system.
ADDENDUM
Although I now seem to recall that there may be a File Exchange (FEX) submission in MEX form. I didn't search to see whether there really is one, but it's probably worth doing so.