Course Outline
Introduction
Overview of Data Access Methods (Hive, databases, etc.)
Overview of Spark Features and Architecture
Installing and Configuring Spark
Understanding Spark DataFrames
Defining Tables and Importing Datasets
Querying DataFrames using SQL
Performing Aggregations, JOINs, and Nested Queries
Uploading and Accessing Data
Querying Different Data Types
- JSON, Parquet, etc.
Querying Data Lakes with SQL
Troubleshooting
Summary and Conclusion
Requirements
- Experience with SQL queries
- Programming proficiency in any language
Target Audience
- Data analysts
- Data scientists
- Data engineers
Testimonials
The exercises and Q&A sessions
Antoine - Physiobotic
Course - Scaling Data Pipelines with Spark NLP
I liked that it was practical. I loved applying the theoretical knowledge to practical examples.
Aurelia-Adriana - Allianz Services Romania
Course - Python and Spark for Big Data (PySpark)
The fact that we were able to take with us most of the information, course materials, presentations, and exercises, so that we can look over them and perhaps redo what we didn't understand the first time or improve on what we already did.