How to read text files from sub-sub folders

Mekala balaji on 4 Oct 2017
Commented: Mekala balaji on 17 Oct 2017
Hi,
I want to read text files from sub-sub folders:
Architecture:
Mainfolder
    Tool1
        sub-subFolder1
        sub-subFolder2
        .....
        .....
    Tool2
        sub-subFolder1
        sub-subFolder2
        .....
        .....
    ......
1. Read the text files under each sub-folder (i.e., Tool1, Tool2, etc.)
2. Output one file per sub-folder: Tool1.xlsx, Tool2.xlsx
I use the following code, but I cannot access the sub-sub folders.
% - Define output header.
header = {'RainFallID', 'IINT', 'Rain Result', 'Start Time', 'Param1.pipe', ...
          '10 Un Para2.pipe', 'Verti 2 mixing.dis', 'Rate.alarm times'} ;
Mainfolder='Mainfolder';
outLocatorFolder='OutputFolder';
nHeaderCols = numel( header ) ;
% - Build listing sub-folders of main folder.
% D_main = dir( 'D:\Mekala_Backupdata\Matlab2010\Mainfolder' ) ;
D_main = dir(Mainfolder ) ;
D_main = D_main(3:end) ; % Eliminate "." and ".."
% - Iterate through sub-folders and process.
for dId = 1 : numel( D_main )
    % - Build listing files of sub-folder.
    D_sub = dir( fullfile(Mainfolder, D_main(dId).name, '*.txt' )) ;
    nFiles = numel( D_sub ) ;
    keyboard
    % - Prealloc output cell array.
    data = cell( nFiles, nHeaderCols ) ;
    % - Iterate through files and process.
    for fId = 1 : nFiles
        % - Read input text file.
        inLocator = fullfile(Mainfolder, D_main(dId).name, D_sub(fId).name ) ;
        content = fileread( inLocator ) ;
        % - Extract relevant data.
        rainfallId = str2double( regexp( content, '(?<=RainFallID\s+:\s*)\d+', 'match', 'once' )) ;
        iint = regexp( content, '(?<=IINT\s+:\s*)\S+', 'match', 'once' ) ;
        rainResult = regexp( content, '(?<=Rain Result\s+:\s*)\S+', 'match', 'once' ) ;
        startTime = strtrim( regexp( content, '(?<=Start Time\s+:\s*).*?(?= -)', 'match', 'once' )) ;
        param1Pipe = str2double( regexp( content, '(?<=Param1.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        tenUn = str2double( regexp( content, '(?<=10 Un Para2.pipe\s+[\d\.]+\s+\w+\s+)[\d\.]+', 'match', 'once' )) ;
        verti2 = regexp( content, '(?<=Verti 2 mixing.dis\s+\S+\s%\s+)\S+', 'match', 'once' ) ;
        rateAlarm = strtrim( regexp( content, '(?<=Rate.alarm times\s+\S+\s+)[^\r\n]+', 'match', 'once' )) ;
        % - Populate data cell array.
        data(fId,:) = {rainfallId, iint, rainResult, startTime, ...
                       param1Pipe, tenUn, verti2, rateAlarm} ;
    end
    % - Output to XLSX.
    % outLocator = fullfile( 'D:\Mekala_Backupdata\Matlab2010\OutputFolder', sprintf( '%s.xlsx', D_main(dId).name )) ;
    outLocator = fullfile(outLocatorFolder, sprintf( '%s.xlsx', D_main(dId).name )) ;
    fprintf( 'Output XLSX: %s ..\n', outLocator ) ;
    xlswrite( outLocator, [header; data] ) ;
end
Many thanks in advance.

Accepted Answer

Image Analyst on 4 Oct 2017
You need to use ** in dir() instead of *. See attached demo.
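The attached demo is not reproduced here. As a minimal sketch, assuming the advice refers to the recursive `**` wildcard that dir() accepts from R2016b onward, the listing could look like the following. The function name `listTxtRecursive` is hypothetical and not taken from the demo.

```matlab
function files = listTxtRecursive(rootFolder)
% Hypothetical helper: list every .txt file below rootFolder, at any depth.
% Assumes MATLAB R2016b or later ('**' wildcard and the 'folder' field of DIR).
D = dir(fullfile(rootFolder, '**', '*.txt'));
D = D(~[D.isdir]);   % keep files only
% Build full paths one element at a time (also safe when D is empty).
files = arrayfun(@(d) fullfile(d.folder, d.name), D, 'UniformOutput', false);
end
```

To keep one output per tool, this could be called inside the existing loop over D_main, for example with fullfile(Mainfolder, D_main(dId).name) as rootFolder, so that each tool's files are gathered regardless of how deep they sit.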

More Answers (1)

Cedric on 4 Oct 2017
Edited: Cedric on 4 Oct 2017
Look at the EDIT 4:09pm block in the thread. Update the pseudo-code
    Iterate through sub folders of 'Mainfolder'
        Iterate through files of sub folder
            Extract data from file and store in data array
        Export data array to relevant Excel file
specifically for your new problem, and it should show you how to restructure and update the former code. First, remove all the code that is not necessary for crawling through the folders and files, and run it to check that it crawls as desired.
Big hint: you should be able to add a level of FOR loop. Define D_sub at a strategic place:
for dmId = 1 : numel( D_main )
    D_sub = dir( fullfile( Mainfolder, D_main(dmId).name )) ;
    D_sub = D_sub(3:end) ; % Eliminate "." and ".."
iterate through its elements (sub-sub-folders):
    for dsId = 1 : numel( D_sub )
        D_subsub = dir( fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, '*.txt' )) ;
        nFiles = numel( D_subsub ) ;
and finally iterate through D_subsub elements (the text files):
        for fId = 1 : nFiles
            inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
            content = fileread( inLocator ) ;
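Put together, a crawl-only version of these three loops might look like the sketch below, which does nothing but visit the files so the traversal can be checked before any extraction code is added back. The names `crawlTxtFiles` and `listSubfolders` are hypothetical. As a deviation from the hint, the sketch drops "." and ".." by name and keeps directories only, instead of relying on (3:end), on the assumption that stray files may sit next to the tool folders.

```matlab
function paths = crawlTxtFiles(mainFolder)
% Hypothetical helper: visit mainFolder/<tool>/<sub-sub-folder>/*.txt and
% return the full path of every text file found, printing each one.
paths = {};
D_main = listSubfolders(mainFolder);
for dmId = 1 : numel(D_main)
    toolFolder = fullfile(mainFolder, D_main(dmId).name);
    D_sub = listSubfolders(toolFolder);
    for dsId = 1 : numel(D_sub)
        subsubFolder = fullfile(toolFolder, D_sub(dsId).name);
        D_subsub = dir(fullfile(subsubFolder, '*.txt'));
        for fId = 1 : numel(D_subsub)
            inLocator = fullfile(subsubFolder, D_subsub(fId).name);
            fprintf('%s\n', inLocator);
            paths{end+1} = inLocator; %#ok<AGROW>
        end
    end
end
end

function D = listSubfolders(folder)
% Keep directories only and drop "." and ".." by name rather than by position.
D = dir(folder);
D = D([D.isdir] & ~ismember({D.name}, {'.', '..'}));
end
```

Calling crawlTxtFiles('Mainfolder') should print one line per text file; text files that sit directly in a tool folder are deliberately not visited, since only the sub-sub-folder level is listed.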
Note that if you have a recent version of MATLAB (R2016b or later, where the output of DIR has a folder field), you can replace most calls to FULLFILE by the value of the folder field of the relevant output of a former DIR, e.g.:
    inLocator = fullfile( Mainfolder, D_main(dmId).name, D_sub(dsId).name, D_subsub(fId).name ) ;
could be replaced by:
    inLocator = fullfile( D_subsub(fId).folder, D_subsub(fId).name ) ;
Finally, note that if you have a lot of different situations with varying depths of nested folders, a better approach would be to build a recursive crawler, but this is a bit more complex.
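For illustration, a recursive crawler can be fairly short. The sketch below is one possible shape, not code from the thread: `crawlRecursive` is a hypothetical name, and it avoids both the `**` wildcard and the folder field so that it should also run on older releases.

```matlab
function paths = crawlRecursive(folder)
% Hypothetical recursive crawler: collect the full paths of all .txt files
% in folder and in every folder nested below it, whatever the depth.
D = dir(fullfile(folder, '*.txt'));
D = D(~[D.isdir]);                       % ignore directories named *.txt
paths = arrayfun(@(d) fullfile(folder, d.name), D(:)', 'UniformOutput', false);
% Recurse into each sub-folder, skipping "." and ".." by name.
S = dir(folder);
S = S([S.isdir] & ~ismember({S.name}, {'.', '..'}));
for k = 1 : numel(S)
    paths = [paths, crawlRecursive(fullfile(folder, S(k).name))]; %#ok<AGROW>
end
end
```

The function returns a 1-by-N cell array of paths. Grouping the results per tool would still require either calling it once per tool folder or parsing the returned paths.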
4 Comments
Cedric on 4 Oct 2017
Edited: Cedric on 4 Oct 2017
You should index D_main with dmId when you generate the output locator. When I wrote the hints above with an additional level of loop, I changed the name of the loop index variables to make them more consistent: dmId for "dir main ID" and dsId for "dir sub ID".
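To show where the output step sits relative to the loops, here is a hedged sketch of the overall structure. The function name `exportPerTool` and the `extractFcn` and `writeFcn` handles are assumptions introduced for illustration; in the question's code the extraction is the block of regexp calls and the writer is xlswrite.

```matlab
function exportPerTool(mainFolder, outFolder, header, extractFcn, writeFcn)
% Hypothetical sketch: one output file per tool folder, gathering rows from
% all of that tool's sub-sub-folders.
%   extractFcn : assumed handle, maps file content (char) to a 1-by-N cell row
%   writeFcn   : assumed handle called as writeFcn(outLocator, cellArray);
%                defaults to xlswrite, as used in the question
if nargin < 5, writeFcn = @xlswrite; end
D_main = listSubfolders(mainFolder);
for dmId = 1 : numel(D_main)
    toolFolder = fullfile(mainFolder, D_main(dmId).name);
    D_sub = listSubfolders(toolFolder);
    data = cell(0, numel(header));   % reset once per tool, not per sub-sub-folder
    for dsId = 1 : numel(D_sub)
        subsubFolder = fullfile(toolFolder, D_sub(dsId).name);
        D_subsub = dir(fullfile(subsubFolder, '*.txt'));
        for fId = 1 : numel(D_subsub)
            content = fileread(fullfile(subsubFolder, D_subsub(fId).name));
            data(end+1, :) = extractFcn(content); %#ok<AGROW>
        end
    end
    % The output name comes from the tool folder, hence the dmId index.
    outLocator = fullfile(outFolder, sprintf('%s.xlsx', D_main(dmId).name));
    fprintf('Output XLSX: %s ..\n', outLocator);
    writeFcn(outLocator, [header; data]);
end
end

function D = listSubfolders(folder)
% Keep directories only and drop "." and ".." by name rather than by position.
D = dir(folder);
D = D([D.isdir] & ~ismember({D.name}, {'.', '..'}));
end
```

The data array is reset inside the dmId loop and written after the dsId loop ends, so each tool's file collects rows from all of its sub-sub-folders.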
Mekala balaji on 17 Oct 2017
Thanks, Sir.
