Problem reading a data file with textscan and organizing the data
I have data profiles stored in a txt file. There are about 50000 sections, and the basic structure of each section looks like this:
#-------------------------------------------
CAST ,, 8741944,WOD
NODC Cruise ID ,,SU-14378 ,,,
Year ,, 1862,,,
VARIABLES ,Depth ,F,O,Temperatur,F,O,,
UNITS ,m , , ,degrees C , , ,,
Prof-Flag , ,0, , ,0, ,,
1, 0.,0, , 26.10,0, ,
END OF VARIABLES SECTION,
#-----------------------------,
So each section has two parts: the basic information and the data. However, the length of each part varies, and sometimes there is more information. In the data part, the order of the columns also changes, and sometimes there is no data part at all. Sometimes Temperatur is in the first column and sometimes in the third. I want to extract the cast information and the Depth and Temperatur data.
I am using textscan to read everything at once, but since there are about 2 million lines, MATLAB gets killed each time I run it because of the memory limit on my computer. Is there a better way to solve this problem? I have just started using MATLAB, so there are probably a lot of problems with the script I wrote.
clear all
fid=fopen('wodall.txt','r');
%read every line as 35 comma-separated text columns
WOD=textscan(fid,repmat('%s',1,35),1954993,'delimiter', ',');
startpoint=find(strncmpi('#',WOD{1},1));
variableline=find(strncmpi('variables',WOD{1},3));
dataline=1+find(strncmpi('prof',WOD{1},4));
endpoint=find(strncmpi('end of variables section',WOD{1},12));
nfiles=length(startpoint)-1;
%find the positions of the lines I want to record and count how many sections there are
castn=NaN(1,nfiles);dep=NaN(1,nfiles,33);tem=NaN(1,nfiles,33);
%create the matrices to record the data
count=0; %counts the sections that do not have a data part
for i=1:nfiles;
castn(i)=str2double(WOD{3}(startpoint(i)+1));
%store the basic information
a=dataline(i-count);
b=endpoint(i-count)-1;
c=variableline(i-count);
d=startpoint(i+1);
%index used above = section number minus the number of sections so far that do not contain a data part.
if (a<d) %store the data if there is data part.
count=count;
%collect rows a:b of all 35 columns into one cell matrix and convert it to numbers
rows=cellfun(@(col) col(a:b),WOD,'UniformOutput',false);
DATA=str2double([rows{:}]);
%collect the VARIABLES line (row c) of all 35 columns
HEADLINE=cellfun(@(col) col{c},WOD,'UniformOutput',false);
sizedata=size(DATA);
lengthmore=33-sizedata(1);
MAKEUP=NaN(lengthmore,35);
DATA=cat(1,DATA,MAKEUP);
DD=0;TT=0;
DD=find(strncmpi('depth',HEADLINE,4));
TT=find(strncmpi('temperatur',HEADLINE,4));
if DD>0
dep(1,i,:)=DATA(:,DD); end
if TT>0
tem(1,i,:)=DATA(:,TT); end
% store the data if they appear
else count=count+1;
end
end
fclose(fid);
save('');
I hope my description is okay. Thanks a lot for reading all this and helping.
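One lower-memory alternative to reading everything at once is to read the file line by line with fgetl and keep only the cast number, depth, and temperature of each section. The following is only a sketch, not tested on the full file. The function name readWodCasts and the output struct layout are hypothetical. It assumes that every section has a line whose first field is exactly CAST with the cast number in the third field, that the numeric rows come after the Prof-Flag line, and that they stop at END OF VARIABLES SECTION or at a # line, as in the sample above.

```matlab
function casts = readWodCasts(fname)
% readWodCasts (hypothetical helper): stream a WOD-style text file and
% return one struct per section with fields cast, depth and temp.
fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
closer = onCleanup(@() fclose(fid));   % close the file even if an error occurs

casts  = struct('cast', {}, 'depth', {}, 'temp', {});
n      = 0;       % index of the current section
dCol   = [];      % column of Depth in the current section
tCol   = [];      % column of Temperatur in the current section
inData = false;   % true while the numeric rows are being read

tline = fgetl(fid);
while ischar(tline)
    % split on every comma (empty fields are kept) and trim whitespace
    f   = strtrim(regexp(tline, ',', 'split'));
    key = f{1};
    if strcmpi(key, 'CAST')
        % a new section starts here (assumption: cast number is field 3)
        n = n + 1;
        if numel(f) >= 3
            casts(n).cast = str2double(f{3});
        else
            casts(n).cast = NaN;
        end
        casts(n).depth = [];
        casts(n).temp  = [];
        dCol = []; tCol = []; inData = false;
    elseif strncmpi(key, 'VARIABLES', 9)
        % the column order changes between sections, so look it up each time
        dCol = find(strncmpi(f, 'Depth', 5), 1);
        tCol = find(strncmpi(f, 'Temperatur', 10), 1);
    elseif strncmpi(key, 'Prof-Flag', 9)
        inData = true;                 % numeric rows start on the next line
    elseif strncmpi(key, 'END OF VARIABLES', 16) || strncmp(key, '#', 1)
        inData = false;
    elseif inData && n > 0
        % one numeric row: a missing column is stored as NaN
        d = NaN; t = NaN;
        if ~isempty(dCol) && dCol <= numel(f), d = str2double(f{dCol}); end
        if ~isempty(tCol) && tCol <= numel(f), t = str2double(f{tCol}); end
        casts(n).depth(end+1) = d;
        casts(n).temp(end+1)  = t;
    end
    tline = fgetl(fid);
end
end
```

It could be called as casts = readWodCasts('wodall.txt'); after which casts(k).cast, casts(k).depth and casts(k).temp hold the values of section k. Sections without a data part simply keep empty depth and temp, so no padding to 33 rows is needed.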