
Volume 300, Issue 5 · IT News · Big Data

What Is Apache Spark? The Big Data Platform That Crushed Hadoop

InfoWorld, March 30th, 2023

Fast, flexible, and developer-friendly, Apache Spark is the leading platform for large-scale SQL, batch processing, stream processing, and machine learning.

Apache Spark is a data processing framework that can quickly perform processing tasks on very large data sets, and can also distribute data processing tasks across multiple computers, either on its own or in tandem with other distributed computing tools. These two qualities are key to the worlds of big data and machine learning, which require the marshalling of massive computing power to crunch through large data stores. Spark also takes some of the programming burdens of these tasks off the shoulders of developers with an easy-to-use API that abstracts away much of the grunt work of distributed computing and big data processing.
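To make the point about the API concrete, here is a minimal word-count sketch using PySpark, Spark's Python API. It assumes PySpark is installed and runs Spark in local mode; the helper names (`tokenize`, `count_words`, `main`), the app name, and the sample sentences are illustrative and not taken from the article.

```python
from operator import add


def tokenize(line):
    """Split a line into lowercase words (plain Python, no Spark needed)."""
    return line.lower().split()


def count_words(spark, lines):
    """Count words with Spark's RDD API.

    The code only describes the transformations; Spark decides how to
    partition the data and schedule the work across the available cores
    or cluster nodes.
    """
    rdd = spark.sparkContext.parallelize(lines)
    counts = rdd.flatMap(tokenize).map(lambda word: (word, 1)).reduceByKey(add)
    return dict(counts.collect())


def main():
    # Imported here so tokenize() stays usable without PySpark installed.
    from pyspark.sql import SparkSession

    # "local[*]" uses this machine's cores; a cluster master URL would go
    # here instead (assumption: no cluster is available for this sketch).
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("word-count-sketch")
        .getOrCreate()
    )
    try:
        print(count_words(spark, ["Spark is fast", "Spark is flexible"]))
    finally:
        spark.stop()
```

Calling `main()` runs the job. Nothing in `count_words` mentions threads, partitions, or network transfer, which is the kind of grunt work the API hides; in principle the same function would run against a cluster by changing only the master URL.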
