Table of Contents
Getting Started
- The Course Overview
- Setting Up an AWS Account
- Launching a Spark Cluster on EC2
- Setting Up Your Environment
- Running a Test Application
Working with RDDs
- Creating RDDs
- Actions
- Transformations
- Joins, Set, and Numeric Operations
- Shared Variables
DataFrames
- Installing Jupyter Notebook
- RDDs and DataFrames
- DataFrame Row Operations
- DataFrame Column Operations
- DataFrame Manipulation
Spark SQL
- Views
- Schemas
- SQL Operations
- I/O Options
- Hive
Machine Learning Fundamentals
- Basic Statistics
- Pipelines
- Feature Extractors
- Feature Transformers
- Feature Selectors
Machine Learning Models
- Classification
- Regression
- Clustering
- Collaborative Filtering
- Model Selection and Tuning
Streaming
- DStreams
- DStream Window Operations
- Structured Streaming
- Window Operations
- Joining Batch and Streaming Data
Apply for Certification
https://www.vskills.in/certification/big-data/apache-spark-certificate