parallel processing for readtable function
ps = parallel.Settings;
ps.Pool.AutoCreate = false; % do not auto-create a parpool when a |parfor| is encountered
ps.Pool.IdleTimeout = Inf;  % never shut the parpool down for being idle
When using 10 files of about 100 MB, the for and parfor versions differ by only a couple of seconds. Is there something wrong with the steps above for parfor? Is there a more convenient parallel processing approach that could be applied?
If the multiple "full_file_name(j,:)" files could be read simultaneously (using multiple cores) rather than sequentially, readtable should become significantly faster.
Edric Ellis on 19 Aug 2021
You've manually disabled the AutoCreate for parallel pools - I presume you're manually creating a pool with a separate parpool statement.
Whether or not parfor can go faster than a serial for loop depends on your underlying hardware for this case. It might simply be that the limiting factor is your disk drive, and trying to read from it multiple times simultaneously does not allow any speedup.
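To check this on your own machine, a comparison along the following lines may help. It is only a minimal sketch, not the code from the question: it assumes Parallel Computing Toolbox is installed, and the temporary folder, the `demo_%d.csv` file names, and the table sizes are hypothetical stand-ins for your real files.

```matlab
% Hypothetical setup: write a few small stand-in CSV files to a temp folder.
nFiles = 4;
folder = tempname;
mkdir(folder);
fileNames = cell(nFiles, 1);
for j = 1:nFiles
    fileNames{j} = fullfile(folder, sprintf('demo_%d.csv', j));
    writetable(array2table(rand(1e4, 5)), fileNames{j});
end

% Start a pool explicitly if none is running (needed when AutoCreate is off).
if isempty(gcp('nocreate'))
    parpool();
end

% Serial read.
tSerial = tic();
serialTables = cell(nFiles, 1);
for j = 1:nFiles
    serialTables{j} = readtable(fileNames{j});
end
serialTime = toc(tSerial);

% Parallel read; each iteration writes only to its own cell (sliced output).
tParallel = tic();
parallelTables = cell(nFiles, 1);
parfor j = 1:nFiles
    parallelTables{j} = readtable(fileNames{j});
end
parallelTime = toc(tParallel);

fprintf('for: %.2f s, parfor: %.2f s\n', serialTime, parallelTime);
```

With files this small the operating system will probably serve them from its cache, so the numbers only become meaningful once `fileNames` points at your real 100 MB files.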
You could try manually running multiple copies of MATLAB and calling readtable from each of them simultaneously to see if they slow down when there is contention for disk access. I.e. run something like this in multiple copies of MATLAB:
t = tic();
I suspect that you will see that as you start more copies of MATLAB, the toc time will increase.
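A fuller version of that timing test might look like the sketch below. The file name is a hypothetical placeholder, and the stand-in file is created only so that the snippet runs as written; for a meaningful result, point `fileName` at one of your own large files and start the snippet in several MATLAB sessions at roughly the same time.

```matlab
% fileName is a hypothetical placeholder; replace it with one of your own files.
fileName = fullfile(tempdir, 'readtable_timing_demo.csv');
if ~isfile(fileName)
    % Create a stand-in file so the sketch runs as written.
    writetable(array2table(rand(1e5, 10)), fileName);
end

t = tic();
T = readtable(fileName);
elapsed = toc(t);
fprintf('readtable took %.2f s\n', elapsed);
```

If the elapsed time grows as more sessions run at once, the disk is the likely bottleneck and parfor is unlikely to help much.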