Spark Applications

Create and execute compiled MATLAB® applications against Spark™ enabled Hadoop® clusters

Supported Platform: Linux® only.

You can deploy MATLAB applications against Spark in two ways:

  • Deploy tall arrays to a Spark enabled Hadoop cluster

  • Deploy applications using the MATLAB API for Spark

To deploy MATLAB applications that contain tall arrays, see Deploy Tall Arrays to a Spark Enabled Hadoop Cluster. To learn more about how to work with tall arrays, see Tall Arrays (MATLAB).

To deploy MATLAB applications that use functions common in Spark programs, such as flatMap, see Deploy Applications Using the MATLAB API for Spark.

The MATLAB API for Spark exposes the Spark programming model to MATLAB, so Spark functions such as flatMap, mapPartitions, and aggregate are available when you write MATLAB applications.
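
As a minimal sketch of this programming model, the following script applies flatMap to a small in-memory data set. It assumes that a Spark installation is reachable in local mode. The application name, the 'local[1]' master, and the executor-core property are illustrative values, not requirements. The script uses no tall arrays.

```matlab
% Sketch only: assumes Spark is installed and can run in local mode.
% The app name, master URL, and Spark property are illustrative values.
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf( ...
    'AppName', 'flatMapSketch', ...
    'Master', 'local[1]', ...
    'SparkProperties', sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

% Distribute a small cell array as an RDD.
inRDD = sc.parallelize({'A', 'B'});

% flatMap applies the function to each element and flattens the
% returned cell arrays into a single RDD.
flatRDD = inRDD.flatMap(@(x) {x, 1});

% Bring the results back to the MATLAB client and display them.
viewRes = flatRDD.collect();
disp(viewRes)

% Shut down the connection to Spark.
delete(sc)
```

For a cluster deployment, the master would typically point at the cluster manager rather than local mode; see Deploy Applications Using the MATLAB API for Spark for the supported workflow.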

MATLAB applications developed using the MATLAB API for Spark cannot be deployed if they contain tall arrays.

See Apache Spark Basics for a short summary of Spark concepts and a discussion of how deployed MATLAB applications incorporate those concepts.

MATLAB has a vast collection of scientific and engineering algorithms, and Spark is a fast, general-purpose engine for large-scale data processing. By deploying MATLAB applications against Spark, you can develop applications in MATLAB and execute them on a Spark enabled cluster.

Supported Apache Spark™ Versions: 1.3–2.x.