Transfer Data with Job Methods and Properties
To transfer data to the cloud cluster, you can use the AttachedFiles or JobData properties, as you would for other clusters. For example:
Place all required executable and data files in the same folder.
Specify that folder in the AttachedFiles property of the job.
Submitting your job transfers the files to the cloud and makes them available to the workers running on the cloud cluster.
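The steps above can be sketched as follows. This is a minimal illustration, not a definitive workflow: the folder name myJobFiles, the function processData, and the JobData contents are hypothetical placeholders, and the code assumes your default cluster profile points to the cloud cluster.

```matlab
% Sketch only: myJobFiles, processData, and the JobData value are hypothetical.
c = parcluster;                                 % cluster object from the default profile
job = createJob(c,AttachedFiles="myJobFiles");  % folder holding the code and data files
job.JobData = struct("scale",2);                % small data made available to all tasks
createTask(job,@processData,1,{});              % one task with one output, no inputs
submit(job);                                    % submitting transfers the attached files
```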
Data stored in job and task properties is available to the client. Therefore, you can access your task or batch function results by calling the fetchOutputs function on the finished job or by reading the OutputArguments property of its tasks. For batch jobs running on the cloud, access the job’s workspace variables with the load function in your client session.
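As a hedged sketch of the two access paths described above, assuming job is a job object that has already finished without errors:

```matlab
% Assumes job is a finished job created with createJob/createTask or batch.
out = fetchOutputs(job);                  % cell array of outputs, one row per task
taskOut = job.Tasks(1).OutputArguments;   % outputs of the first task only
```

fetchOutputs collects the results of all tasks at once, while OutputArguments lets you inspect a single task.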
For example, the following sections show you how to run a batch job with files on your machine and a function divideData on clusters in Cloud Center and obtain the results of the computation.
Prepare Example
Copy the required data for this example to your current working folder by opening the supporting function prepareSupportingFiles and using the code inside.
openExample("parallel/RunBatchJobAndAccessFilesFromWorkersExample", ...
    supportingFile="prepareSupportingFiles.m")
Your current working folder now contains four files: A.dat, B1.dat, B2.dat, and B3.dat.
Run Batch Job
Create and discover your Cloud Center profile in MATLAB. Set it as your default cluster profile. For more details, see Create and Discover Clusters.
Create a cluster object using parcluster (Parallel Computing Toolbox).
c = parcluster;
Submit the job using batch (Parallel Computing Toolbox). Use the AttachedFiles name-value argument to transfer files from your local machine to the workers. For example, offload the computations in the divideData function to a parallel pool with three workers.
filenames = "B" + string(1:3) + ".dat";
job = batch(c,@divideData,1,{}, ...
    Pool=3, ...
    AttachedFiles=filenames);
To block MATLAB until the job completes, use the wait function on the job object.
wait(job);
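If you prefer not to block indefinitely, wait also accepts a target state and a timeout in seconds. The following is a small sketch under the assumption that job is the object returned by batch above; the 60-second timeout is an arbitrary illustrative value.

```matlab
% Assumes job is the job object returned by batch; the timeout is illustrative.
finished = wait(job,'finished',60);   % returns false if the timeout expires first
if ~finished
    disp("Job state: " + job.State);  % for example, queued or running
end
```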
Retrieve Results and Clean Up Data
To retrieve the results of a batch job, use the fetchOutputs function. fetchOutputs returns a cell array with the outputs of the function run with batch.
X = fetchOutputs(job)
X = 1×1 cell array
    {40×207 double}
You can also access the job’s workspace variables with the load (Parallel Computing Toolbox) function.
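As an illustration of load, here is a minimal sketch that assumes a batch job running a script rather than a function, because load retrieves workspace variables from script-based batch jobs. The script name myScript and the variable name result are hypothetical.

```matlab
% Hypothetical: myScript.m is a script that defines a variable named result.
c = parcluster;
job = batch(c,"myScript");   % run the script as a batch job on the cluster
wait(job);                   % block until the job finishes
load(job,"result");          % copy only the variable result into the client workspace
```

Calling load(job) without a variable name retrieves all variables from the job's workspace.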
When you have retrieved all the required outputs and do not need the job object anymore, delete it to clean up its data and avoid consuming resources unnecessarily.
delete(job)
clear job
For more details, see Run Batch Job and Access Files from Workers (Parallel Computing Toolbox).