Big Data Analysis with Scala and Spark
Paid Course - Subscription
$49 per month
Audit option: Yes
Intermediate
Free trial availability: Yes
Instructor Type: Institution backed
Certification availability: Yes
Adjustable deadlines
~15 hours
Course taught in: English
Captions availability: English

About Course 

Manipulating big data distributed over a cluster using functional concepts is common in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop and, most recently, Apache Spark, a fast, in-memory distributed collections framework written in Scala. In this course, we’ll see how the data-parallel paradigm can be extended to the distributed case, using Spark throughout. We’ll cover Spark’s programming model in detail, being careful to understand how and when it differs from familiar programming models, like shared-memory parallel collections or sequential Scala collections. Through hands-on examples in Spark and Scala, we’ll learn when important issues related to distribution, like latency and network communication, should be considered and how they can be addressed effectively for improved performance.

Learning Outcomes. By the end of this course you will be able to:

– read data from persistent storage and load it into Apache Spark,
– manipulate data with Spark and Scala,
– express algorithms for data analysis in a functional style,
– recognize how to avoid shuffles and recomputation in Spark.

Platform