- After a little probing, I set 'DeflateLevel' to 0, which turns off compression, and the file sizes became identical for both methods.
- Given that, the behaviour is probably related to writing in blocks. For the second file (test2.nc), you write the data in 100 blocks of 10,000 columns each, and this could affect compression efficiency. When data is written in smaller blocks, the compression algorithm may have less context to work with than when everything is written as one large block. That can produce larger files, because the algorithm cannot exploit patterns that span larger stretches of the data.
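One way to probe the block-writing hypothesis is to look at how the variable is chunked on disk, since netCDF-4 applies deflate per storage chunk. The sketch below is only an illustration under stated assumptions: the file names chunk_default.nc and chunk_aligned.nc are made up, N is reduced to keep the run short, and setting 'ChunkSize' so that one chunk matches one write block is a guess at what might matter, not a confirmed fix. It prints the chunk layout and file size for both cases so they can be compared.

```matlab
% Hypothetical experiment: default chunk layout vs. chunks matching the write blocks.
% File names and the reduced N are assumptions made for this sketch.
N = 100000;                  % smaller than the original 1e6 to keep the run short
nBlocks = 100;
n = N/nBlocks;               % columns written per ncwrite call
data = single(rand(30,N));

files = {'chunk_default.nc','chunk_aligned.nc'};
for k = 1:2
    fn = files{k};
    if isfile(fn), delete(fn); end
    args = {'Dimensions',{'x' 30 'y' N},'Datatype','single', ...
            'DeflateLevel',9,'Format','netcdf4'};
    if k == 2
        args = [args {'ChunkSize',[30 n]}];   % one chunk per write block
    end
    nccreate(fn,'test',args{:});
    for i = 1:n:N
        ncwrite(fn,'test',data(:,i:i+n-1),[1 i]);
    end
    info = ncinfo(fn,'test');                 % reports ChunkSize and DeflateLevel
    d = dir(fn);
    fprintf('%s: ChunkSize = [%s], size = %.1f MB\n', fn, ...
            num2str(info.ChunkSize), d.bytes/1024^2);
end
```

If the two files differ noticeably in size, the chunk layout is a likely contributor; if they do not, the cause lies elsewhere.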
netCDF writing large files when created with data writing in finite blocks
Dave Callaghan
on 8 Apr 2022
Answered: Gyan Vaibhav
on 1 Feb 2024
The following code creates two netCDF files containing the same data (rand(30,N)). On Windows 10 running MATLAB R2021b, the first file is 103 MB, while the second, which contains the same data written in 100 blocks, is 1200 MB. When I load both files and test whether they differ, they are identical.
If I run this on Linux, I get the same 103 MB for the first file, written in one block, but the file written in 100 blocks is 474 MB.
Any ideas on this would be most welcome.
fclose all; close all; clear; clc
ncFormat = 'netcdf4'; % options: classic, 64bit, netcdf4_classic, netcdf4

% File 1: write the whole array in a single call
fn = 'test1.nc';
N = 1000000;
if isfile(fn), delete(fn); end % avoid a warning when the file does not exist yet
nccreate(fn,'test','Dimensions',{'x' 30 'y' N},'Datatype','single','DeflateLevel',9,'Format',ncFormat);
test = rand(30,N);
ncwrite(fn,'test',single(test))
A = dir(fn); fprintf('size = %g MB\n',A(1).bytes/1024^2)

% File 2: write the same array in 100 blocks of n columns each
fn = 'test2.nc';
if isfile(fn), delete(fn); end
nccreate(fn,'test','Dimensions',{'x' 30 'y' N},'Datatype','single','DeflateLevel',9,'Format',ncFormat);
n = N/100;
for i = 1:n:N
    ncwrite(fn,'test',single(test(:,i:i+n-1)),[1 i])
end
A = dir(fn); fprintf('size = %g MB\n',A(1).bytes/1024^2)

% Compare the contents of the two files
fprintf('files are same if result is zero: %g\n',sum(sum(abs(ncread('test1.nc','test')-ncread('test2.nc','test')))))
Accepted Answer
Gyan Vaibhav
on 1 Feb 2024
Hello Dave,
I came across your question, and what you describe is correct. However, I also tried it on a Linux machine, and it shows sizes similar to Windows, i.e. around 110 MB and 1200 MB for the respective methods.
PS: Even so, the file should not have grown beyond its uncompressed size, i.e. the size with 'DeflateLevel' set to 0.
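For reference, the uncompressed payload can be worked out from the dimensions alone. The short sketch below assumes nothing beyond the 30-by-N single-precision array from the question (4 bytes per value) and the file sizes reported in this thread; the variable names are illustrative.

```matlab
% Raw size of a 30-by-N single array, ignoring netCDF/HDF5 metadata overhead.
nx = 30;
N = 1000000;
bytesPerSingle = 4;
rawMB = nx*N*bytesPerSingle/1024^2;
fprintf('uncompressed payload = %.1f MB\n', rawMB)   % 114.4 MB

% File sizes reported in the thread (MB), relative to the raw payload.
reportedMB = [103 474 1200];
ratio = reportedMB/rawMB;
fprintf('reported %g MB is %.2f times the raw size\n', [reportedMB; ratio])
```

By this measure the 103 MB file sits slightly below the raw payload, while the block-written files are several times larger than the data they hold.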
Hope it gives some insight.
Thanks