From 167cd4409541d2332f584b9f0b28872db25c917c Mon Sep 17 00:00:00 2001
From: Tomasz Hamerla
Date: Fri, 21 Oct 2022 15:24:34 +0200
Subject: [PATCH] Add content for spark and mapreduce (#2649)

* Update 100-hadoop-spark-mapreduce.md

* Update content/roadmaps/114-software-architect/content/109-working-with-data/100-hadoop-spark-mapreduce.md

Co-authored-by: Kamran Ahmed
---
 .../109-working-with-data/100-hadoop-spark-mapreduce.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/content/roadmaps/114-software-architect/content/109-working-with-data/100-hadoop-spark-mapreduce.md b/content/roadmaps/114-software-architect/content/109-working-with-data/100-hadoop-spark-mapreduce.md
index f7c6242b7..05de64c3c 100644
--- a/content/roadmaps/114-software-architect/content/109-working-with-data/100-hadoop-spark-mapreduce.md
+++ b/content/roadmaps/114-software-architect/content/109-working-with-data/100-hadoop-spark-mapreduce.md
@@ -1 +1,8 @@
-# Hadoop spark mapreduce
\ No newline at end of file
+# Spark, Hadoop MapReduce
+
+[Apache Spark](https://spark.apache.org/) is a data processing framework that can quickly perform processing tasks on very large data sets, and can also distribute data processing tasks across multiple computers, either on its own or in tandem with other distributed computing tools.
+
+Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
+
+Spark vs Hadoop MapReduce
+Hadoop explained in 5 minutes