How can I manipulate a substantial number of NetCDF files with variables of large dimensions to produce statistics?

Hello,
I am wondering whether anyone could point me to a better approach for manipulating a substantial number of NetCDF files with variables of large dimensions to produce statistics.
I have 8000 NetCDF files of ~6500 KB each. Each file contains a variable of interest with the same dimensions (2048x4096 pixels). My intention is to calculate the mean, std, trend, etc. for each pixel and create 2D matrices with the statistical results.
Is there an efficient way to handle these large NetCDF files that would let me open each of the 8000 files in a loop, build a time series for each pixel (i,j), calculate statistics on those time series, and produce a final 2D matrix for each statistic?
Here are my attempts, which are neither efficient nor functional for this purpose. Your suggestions are very much appreciated.
My first idea was to build one large 3D matrix by looping over all the NetCDF files. It failed because the system ran out of memory, which I expected but tried anyway.
clear all, close all
projectdir = 'C:\NetcdfFiles';
dinfo = dir( fullfile(projectdir, '*.nc') );
num_files = length(dinfo);
filenames = fullfile( projectdir, {dinfo.name} );
% Read one file only to get the grid size
test1 = ncread(filenames{1}, 'DATA');
% 2048 x 4096 x 8000 doubles: this allocation is what runs out of memory
NEW_DATA = nan(size(test1,1), size(test1,2), num_files); clear test1

tic
for K = 1 : num_files
    this_file = filenames{K};
    out = ncread(this_file, 'DATA');
    NEW_DATA(:,:,K) = out;   % stack every file along the third (time) dimension
end
toc
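For scale, the full stack would need 2048 x 4096 x 8000 x 8 bytes, about 500 GiB as doubles, so it cannot be preallocated on an ordinary machine. One possible alternative, offered as a minimal sketch rather than a definitive solution, is to keep only a few 2D running accumulators (each about 64 MiB at full size) and update them as each file is read, using Welford-style online updates for the mean, standard deviation, and least-squares slope. The sketch assumes the time axis is simply the file index (equally spaced files in sorted order) and that there are no missing values; the names demo_dir, mean_map, std_map, and trend_map are hypothetical, and the tiny synthetic files exist only so the code runs as-is.

```matlab
% Stand-in data (hypothetical): a few tiny files so the sketch runs as-is.
% For the real data, build filenames from dir(fullfile(projectdir, '*.nc')).
demo_dir = tempname;  mkdir(demo_dir);
nx = 4;  ny = 5;  num_files = 6;
filenames = cell(1, num_files);
for K = 1:num_files
    filenames{K} = fullfile(demo_dir, sprintf('File%d.nc', K));
    nccreate(filenames{K}, 'DATA', 'Dimensions', {'x', nx, 'y', ny});
    ncwrite(filenames{K}, 'DATA', 2*K + repmat((1:nx)', 1, ny));  % slope of 2 per file
end

% Running accumulators: only a handful of 2D arrays are ever held in memory.
first  = ncread(filenames{1}, 'DATA');
mean_x = zeros(size(first));   % running mean per pixel
M2     = zeros(size(first));   % running sum of squared deviations per pixel
Cxt    = zeros(size(first));   % running co-deviation of value and time per pixel
mean_t = 0;                    % running mean of the time index
M2t    = 0;                    % running sum of squared deviations of the time index

for K = 1:num_files
    x  = double(ncread(filenames{K}, 'DATA'));
    t  = K;                    % assumption: time is the file index
    dx = x - mean_x;           % deviation from the previous mean
    dt = t - mean_t;
    mean_x = mean_x + dx / K;
    mean_t = mean_t + dt / K;
    M2  = M2  + dx .* (x - mean_x);
    M2t = M2t + dt  * (t - mean_t);
    Cxt = Cxt + dx  * (t - mean_t);
end

mean_map  = mean_x;                       % per-pixel mean
std_map   = sqrt(M2 / (num_files - 1));   % per-pixel sample standard deviation
trend_map = Cxt / M2t;                    % per-pixel least-squares slope per file step

rmdir(demo_dir, 's');                     % remove the stand-in files
```

If the files contain fill values, ncread returns them as NaN and they would propagate into the maps, so a per-pixel count of valid samples would be needed in that case. If the files are not equally spaced in time, t could be set to the actual time stamp of each file instead of K.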
My second idea was to calculate statistics from subareas of each NetCDF file. Specifically, I divide the 2048x4096 matrix from each of the 8000 files into small regions, such as 100x100 pixels, until the whole global area is covered. This is becoming tedious, time-consuming, and inefficient, and I still run into the same out-of-memory issues even after running multiple loops to complete the task.
clear all, close all
projectdir = 'C:\NetcdfFiles';
dinfo = dir( fullfile(projectdir, '*.nc') );
num_files = length(dinfo);
filenames = fullfile( projectdir, {dinfo.name} );
tic
for K = 1 : num_files
    this_file = filenames{K};
    DATA = ncread(this_file, 'DATA');
    DATA1(:,:,K) = DATA(1:100,1:100);   % extracting subareas
    DATA2(:,:,K) = DATA(101:200,1:100); % extracting subareas
    ....
    DATA???(:,:,K) = DATA(1948:2048,3996:4096); % extracting subareas
end
toc
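The subarea idea might be made workable by processing one tile at a time instead of keeping all subareas alive at once: ncread accepts start and count arguments, so only the tile of interest is read from each file, and a single tile-sized stack is reused. The following is a minimal sketch under assumptions, not a tested solution for the real data: the tile size, the name median_map, and the synthetic stand-in files are hypothetical, and the median is used only as an example of a statistic that needs the whole time series of each pixel.

```matlab
% Stand-in data (hypothetical): a few tiny files so the sketch runs as-is.
% For the real data, build filenames from dir(fullfile(projectdir, '*.nc')).
demo_dir = tempname;  mkdir(demo_dir);
nx = 6;  ny = 8;  num_files = 5;
filenames = cell(1, num_files);
for K = 1:num_files
    filenames{K} = fullfile(demo_dir, sprintf('File%d.nc', K));
    nccreate(filenames{K}, 'DATA', 'Dimensions', {'x', nx, 'y', ny});
    ncwrite(filenames{K}, 'DATA', K + repmat((1:nx)', 1, ny));
end

% Grid size taken from the file metadata, without reading the data itself.
vinfo = ncinfo(filenames{1}, 'DATA');
sz    = vinfo.Size;                 % [rows cols] of the variable
tile  = [4 4];                      % assumed tile size; larger (e.g. [128 128]) for real data

median_map = nan(sz);
for r0 = 1:tile(1):sz(1)
    for c0 = 1:tile(2):sz(2)
        nr = min(tile(1), sz(1) - r0 + 1);   % edge tiles may be smaller
        nc = min(tile(2), sz(2) - c0 + 1);
        stack = zeros(nr, nc, num_files);    % one tile through time
        for K = 1:num_files
            % Read only this tile: start index [r0 c0], extent [nr nc]
            stack(:, :, K) = ncread(filenames{K}, 'DATA', [r0 c0], [nr nc]);
        end
        median_map(r0:r0+nr-1, c0:c0+nc-1) = median(stack, 3);
    end
end

rmdir(demo_dir, 's');               % remove the stand-in files
```

The trade-off is that every file is opened once per tile, so smaller tiles mean less memory but many more reads. At full size, a 128x128 tile across 8000 files is about 1 GiB as doubles.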
Thank you for your help.

Answers (0)
