*7.2.5 Deployment*

Executing heavy processing on a cluster, controlling the slave nodes, distributing tasks among them fairly, and arbitrating the amount of CPU and memory allocated to each process: this is the role of a cluster manager. Spark currently offers three solutions: Spark Standalone, YARN, and Mesos.

<sup>3</sup> Spark Programming Guide – Spark 1.2.0 Documentation. [Online]. Available: http://spark.apache.org/docs/1.2.0/programming-guide.html

Bundled with Spark, Spark Standalone is the easiest cluster manager to set up. It relies on Akka for communication between nodes and on ZooKeeper to guarantee the high availability of the master node. It provides a web console to supervise processing and a mechanism to collect logs from the slaves.
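As an illustration, a standalone cluster is launched with the scripts shipped in Spark's `sbin/` directory; the host names below are placeholders, and the ZooKeeper settings are the documented properties for master recovery, shown here as a sketch rather than a complete production configuration:

```shell
# On the master node: start the master process.
# Its web console is served on port 8080 by default,
# and slaves connect on port 7077.
./sbin/start-master.sh

# From the master node: start a worker on every host
# listed in conf/slaves (one host name per line).
./sbin/start-slaves.sh

# Optional high availability via ZooKeeper: add these
# options in conf/spark-env.sh on each master candidate
# (zk-host:2181 is a placeholder ZooKeeper address).
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk-host:2181"
```

With the ZooKeeper options set, several masters can be started; ZooKeeper elects a leader and the standby takes over if the active master fails.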

Alternatively, Spark can run on YARN, the Hadoop cluster manager, alongside Hadoop jobs. Finally, Mesos, more sophisticated and more general, allows a finer-grained allocation of resources (memory, CPU) to the different applications.
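Whichever cluster manager is chosen, an application is submitted the same way with `spark-submit`; essentially only the `--master` URL changes. The host names, resource figures, and `app.jar` below are illustrative:

```shell
# Spark Standalone: point at the master's spark:// URL.
./bin/spark-submit --master spark://master-host:7077 \
  --executor-memory 2G --total-executor-cores 8 app.jar

# YARN (cluster mode, Spark 1.2 syntax): the ResourceManager
# address is read from the Hadoop configuration, not the URL.
./bin/spark-submit --master yarn-cluster \
  --num-executors 4 --executor-memory 2G app.jar

# Mesos: point at the Mesos master.
./bin/spark-submit --master mesos://mesos-host:5050 \
  --executor-memory 2G app.jar
```

This uniformity is what makes the choice of cluster manager largely a deployment decision: the application code itself does not change.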
