*7.2.6 Components of Spark*

Because Spark's core engine is both fast and general-purpose, it powers multiple specialized higher-level components for various workloads, such as SQL or machine learning. These components are designed to interoperate closely, so you can combine them like libraries in a software project.

Spark Core: Contains the basic functionality of Spark, including components for job scheduling, memory management, fault recovery, interaction with storage systems, and more. Spark Core is also home to the API that defines Resilient Distributed Datasets (RDDs), which are Spark's main programming abstraction. An RDD represents a collection of objects distributed across several compute nodes that can be manipulated in parallel. Spark Core offers many APIs for building and manipulating these collections.
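The idea of a collection split across nodes and manipulated in parallel can be illustrated with a small toy sketch in plain Python. This is not Spark's API: the helper names `partition`, `parallel_map`, and `collect` are hypothetical, and worker threads stand in for compute nodes. The steps are only roughly analogous to PySpark's `SparkContext.parallelize`, `RDD.map`, and `RDD.reduce`.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(data, num_partitions):
    """Split a list into num_partitions contiguous chunks of near-equal size.

    Each chunk is a stand-in for the slice of the dataset held by one node.
    """
    size, extra = divmod(len(data), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_map(func, partitions):
    """Apply func to every element, using one worker thread per partition."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # pool.map preserves the order of the partitions.
        return list(pool.map(lambda part: [func(x) for x in part], partitions))


def collect(partitions):
    """Gather all partitions back into a single local list."""
    return [x for part in partitions for x in part]


numbers = list(range(1, 11))
partitions = partition(numbers, 3)                    # distribute the data
squared = parallel_map(lambda x: x * x, partitions)   # transform in parallel
total = reduce(lambda a, b: a + b, collect(squared))  # aggregate the result
print(total)
```

In Spark itself, the partitions would live on different machines, and Spark Core would additionally handle scheduling and recovery from node failures, which this sketch does not attempt to model.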

Beyond the Spark Core API, the Spark ecosystem includes additional libraries that provide further capabilities for big data analysis. These libraries are Spark Streaming, Spark SQL, Spark MLlib, and Spark GraphX (**Figure 6**) [32].
