Data Science with Spark
9 December 2024 – Virtual
Apache Spark is a powerful, open-source processing engine built around speed, ease of use, and advanced analytics. This course teaches you to unlock its full potential and master this challenging tool.
Looking to upskill your team(s) or organization?
Rozaliia will gladly help you further with custom training solutions.
Get in touch
Duration
3 days
Time
09:00 – 17:00
Language
English
Lunch
Included
Certification
No
Level
Advanced
What will you learn?
After the training, you will be able to:
Process large-scale data using PySpark.
Understand the fundamentals of Apache Spark.
Scale your machine learning workflows using PySpark.
Program
- Spark execution and Spark sessions.
- DataFrame methods, properties, and actions.
- APIs: (Py)Spark DataFrame vs Spark SQL.
- Reading and writing data in Spark.
This training is for you if:
You have worked with Python before and want to know how to scale to large datasets.
You have started, or are about to start, working with large data.
You know the concepts of machine learning and want to know how to apply them at scale.
This training is not for you if:
You won’t be working with Spark but want to learn Python (check out our Python for Data Analysis training instead).
You would like an introduction to machine learning (check out our Certified Data Science with Python course instead).
Why should I take this training?
Learn the fundamentals of Apache Spark
Learn from the Spark experts
Learn to process large-scale data using PySpark and perform machine learning
What else should I know?
After registering for this training, you will receive a confirmation email with practical information. A week before the training, we will ask you about any dietary requirements and share reading material if any preparation is needed.
See you soon!
Course information
All literature and course materials are included in the price.