  • Bangalore
  • +91-80-39-511-501
  • reachus@coursezoo.com
CourseZoo.com

Hands-On Learning For All

  • Courses
    • Big Data with Apache Spark Course
    • Big Data with Hadoop Course
  • Workshops
    • Automation Anywhere – RPA Training
    • AWS Certification Training With Live Hands-On
    • Apache Spark & Hive Hands-On Workshop
  • Projects
  • Blog

Big Data with Apache Spark Course

Hands-on course on Big Data and Apache Spark, from core to advanced topics

  • Full access to Online Lab, a browser-based lab for hands-on learning
  • Full access to projects: Spark Streaming, Kafka, etc.
  • 1-on-1 support during and after the course
  • Industry expert trainer

This course is best suited for those who want to gain Big Data analytics skills to:

  • Analyze huge datasets quickly
  • Build, deploy and run Spark applications on Spark clusters
  • Process continual streams of data with Spark Streaming
  • Frame big data analysis problems as Apache Spark scripts
  • Develop distributed code using the Scala programming language
  • Optimize Spark jobs through partitioning, caching, and other techniques
  • Transform structured data using Spark SQL and DataFrames
  • Traverse and analyze graph structures using GraphX

Tools Covered:

[Image: Apache Spark tools covered]

Course Outline

Introduction to Big Data & Apache Spark
  • Introduction to Big Data
  • Introduction to Apache Spark and a brief history
  • Components of Apache Spark unified stack
  • Who uses Apache Spark?
Getting started with Apache Spark
  • Downloading & installing Apache Spark
  • Running the examples & shell (Python & Scala)
  • Introduction to core Apache Spark concepts
Introduction to Online Lab
  • What is Online Lab
  • Components of Online Lab
  • Logging into Online Lab
  • First Hands-On using Online Lab
Understanding resilient distributed datasets (RDDs)
  • RDD basics
  • Creating RDDs
  • Working with RDD operations
  • Passing functions to Apache Spark
  • Common transformations and actions
  • Persistence (caching)
Working with key/value pairs
  • Creating pair RDDs
  • Transformations on pair RDDs
  • Actions available on pair RDDs
  • Data partitioning (advanced)
Loading and saving your data
  • Various file formats & file systems
  • Structured data with Apache Spark SQL
  • Databases
Advanced Apache Spark programming
  • Introduction
  • Accumulators
  • Broadcast variables
  • Working on a per-partition basis
  • Piping to external programs
  • Numeric RDD operations
Running Apache Spark on a cluster
  • Introduction
  • Apache Spark runtime architecture
  • Submitting applications with spark-submit
  • Packaging your code and dependencies
  • Scheduling within and between Apache Spark applications
  • Cluster managers
Introduction to Apache Spark libraries
  • Understanding Apache Spark SQL
  • Using Apache Spark SQL in applications
  • Machine learning basics
  • Machine learning with MLlib; graph processing with GraphX
Apache Spark Streaming
  • A simple example
  • Architecture and abstraction
  • Transformations & output operations
  • Input sources
  • Streaming UI
  • Performance considerations
  • Kafka basics

Contact Us

(+91) 80-39-511-501

Courses

  • Big Data with Apache Spark Course
  • Big Data with Hadoop Course

Workshops

  • Automation Anywhere – RPA Training
  • AWS Certification Training With Live Hands-On
  • Apache Spark Workshop With Projects Hands-On

Resources

  • Projects
  • Blog

Quick Links

  • Corporate Training
  • College Tie Ups
  • Contact Us
  • Privacy Policy
© Copyright 2019 CourseZoo.com. All rights reserved.
