Become a Data Engineer
Data Engineering is the foundation for the new world of Big Data. Enroll now to build production-ready data infrastructure, an essential skill for advancing your data career.
ESTIMATED TIME: 5 Months
At 5 hrs/week
ENROLL BY: August 14, 2019
Get access to the classroom immediately upon enrollment
Intermediate Python & SQL
Intermediate Python programming knowledge, of the sort gained through the Programming for Data Science Nanodegree program, other introductory programming courses or programs, or additional real-world software development experience, including:
- Strings, numbers, and variables; statements, operators, and expressions
- Lists, tuples, and dictionaries; conditions and loops
- Procedures, objects, modules, and libraries
- Troubleshooting and debugging; research and documentation
- Problem solving; algorithms and data structures
This content is also available in the Introduction to Python Programming course.
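As a rough gauge of that level, the short sketch below (with made-up song data) uses variables, lists, dictionaries, a loop, and a function; if it reads easily, the Python prerequisite should not be a hurdle.

```python
# Illustrative only: the song data and function are made up for this sketch.

def longest_songs(songs, limit=3):
    """Return the titles of the `limit` longest songs, sorted by duration."""
    ranked = sorted(songs, key=lambda song: song["duration"], reverse=True)
    return [song["title"] for song in ranked[:limit]]

if __name__ == "__main__":
    sample = [
        {"title": "Song A", "duration": 215.4},
        {"title": "Song B", "duration": 189.0},
        {"title": "Song C", "duration": 243.7},
    ]
    for title in longest_songs(sample, limit=2):
        print(title)
```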
Intermediate SQL knowledge, of the sort gained through the Programming for Data Science Nanodegree program, including:
- Joins, Aggregations, and Subqueries
- Table definition and manipulation (Create, Update, Insert, Alter)
This content is also available in the SQL for Data Analysis course.
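As a rough gauge of the SQL level, the sketch below exercises the constructs listed above (table definition, inserts, a join, an aggregation, and a subquery) against an in-memory SQLite database from Python, so it needs no server. The tables and data are made up for illustration.

```python
import sqlite3

# In-memory database: nothing to install or configure beyond Python itself.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table definition and manipulation (illustrative schema, not a course schema).
cur.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE plays (play_id INTEGER PRIMARY KEY, user_id INTEGER, song TEXT);

    INSERT INTO users VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO plays VALUES (1, 1, 'Song A'), (2, 1, 'Song B'), (3, 2, 'Song A');
""")

# Join + aggregation, with a subquery restricting the result to active users.
cur.execute("""
    SELECT u.name, COUNT(*) AS play_count
    FROM users u
    JOIN plays p ON p.user_id = u.user_id
    WHERE u.user_id IN (SELECT DISTINCT user_id FROM plays)
    GROUP BY u.name
    ORDER BY play_count DESC;
""")
print(cur.fetchall())  # [('Ada', 2), ('Grace', 1)]
conn.close()
```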
What You Will Learn
Learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets. At the end of the program, you’ll combine your new skills by completing a capstone project.
5 months to complete
To be successful in this program, you should have intermediate Python and SQL skills. See detailed requirements.
Data Modeling
Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.
DATA MODELING WITH POSTGRES
DATA MODELING WITH APACHE CASSANDRA
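To give a flavor of the kind of ETL involved, here is a minimal Python sketch that loads rows from a flat file into a PostgreSQL table with psycopg2. The database name, table, and CSV layout are placeholders, not the schema used in the course projects.

```python
import csv
import psycopg2

# Hypothetical table for the sketch: one relational table of songs.
create_songs = """
    CREATE TABLE IF NOT EXISTS songs (
        song_id  TEXT PRIMARY KEY,
        title    TEXT NOT NULL,
        artist   TEXT,
        duration NUMERIC
    );
"""
insert_song = """
    INSERT INTO songs (song_id, title, artist, duration)
    VALUES (%s, %s, %s, %s)
    ON CONFLICT (song_id) DO NOTHING;
"""

# Placeholder connection details for a local PostgreSQL instance.
conn = psycopg2.connect("host=localhost dbname=music_demo user=student password=student")
cur = conn.cursor()
cur.execute(create_songs)

# Extract rows from a flat file and load them into the relational model.
with open("songs.csv", newline="") as f:
    for row in csv.DictReader(f):
        cur.execute(
            insert_song,
            (row["song_id"], row["title"], row["artist"], float(row["duration"])),
        )

conn.commit()
conn.close()
```

A Cassandra version of the same load would follow the same extract and transform steps but write through the DataStax Python driver into tables designed around specific queries.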
Cloud Data Warehouses
Sharpen your data warehousing skills and deepen your understanding of data infrastructure. Create cloud-based data warehouses on Amazon Web Services (AWS).
BUILD A CLOUD DATA WAREHOUSE
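As one illustration of a cloud warehouse load, the sketch below issues a COPY command to pull JSON logs from S3, assuming Amazon Redshift as the AWS warehouse (Redshift speaks the PostgreSQL wire protocol, so psycopg2 works). The cluster endpoint, credentials, IAM role, and bucket are placeholders.

```python
import psycopg2

# Placeholder COPY statement: staging table, bucket, IAM role, and region are
# all hypothetical names for this sketch.
copy_events = """
    COPY staging_events
    FROM 's3://example-bucket/log-data/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-role'
    FORMAT AS JSON 'auto'
    REGION 'us-west-2';
"""

# Placeholder cluster endpoint and credentials.
conn = psycopg2.connect(
    host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="awsuser",
    password="replace-me",
)
cur = conn.cursor()
cur.execute(copy_events)  # the warehouse pulls the files from S3 in parallel
conn.commit()
conn.close()
```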
Spark and Data Lakes
Understand the big data ecosystem and how to use Spark to work with massive datasets. Store big data in a data lake and query it with Spark.
BUILD A DATA LAKE
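For a sense of that workflow, the sketch below reads raw JSON with Spark, writes it to a data lake as partitioned Parquet, and queries it with Spark SQL. The S3 paths and column names are placeholders, not the course dataset.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("data_lake_sketch").getOrCreate()

# Read raw JSON from a hypothetical bucket of source files.
songs = spark.read.json("s3a://example-bucket/raw/song-data/*.json")

# Write an analytics-friendly copy to the lake, partitioned for pruning.
songs.write.mode("overwrite").partitionBy("year").parquet(
    "s3a://example-bucket/lake/songs/"
)

# Query the same data with Spark SQL.
songs.createOrReplaceTempView("songs")
spark.sql("""
    SELECT year, COUNT(*) AS song_count
    FROM songs
    GROUP BY year
    ORDER BY song_count DESC
""").show()

spark.stop()
```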
Data Pipelines with Airflow
Schedule, automate, and monitor data pipelines using Apache Airflow. Run data quality checks, track data lineage, and work with data pipelines in production.
DATA PIPELINES WITH AIRFLOW
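To show the shape of such a pipeline, here is a minimal DAG sketch with a scheduled load task followed by a data quality check, assuming Airflow 1.x-style imports (current when this program launched); the task logic is placeholder only.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python_operator import PythonOperator  # Airflow 1.x import path


def load_data():
    # Placeholder: a real task would extract from S3 and load into the warehouse.
    print("load step would run here")


def check_quality():
    # Placeholder: a real check would query the warehouse for the loaded row count.
    row_count = 42
    if row_count < 1:
        raise ValueError("Data quality check failed: no rows were loaded")


default_args = {
    "owner": "student",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG(
    "example_pipeline",          # hypothetical DAG id
    default_args=default_args,
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",  # run once per day
    catchup=False,
)

load = PythonOperator(task_id="load_data", python_callable=load_data, dag=dag)
quality_check = PythonOperator(task_id="check_quality", python_callable=check_quality, dag=dag)

# Run the quality check only after the load succeeds.
load >> quality_check
```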
Capstone Project
Combine what you’ve learned throughout the program to build your own data engineering portfolio project.
DATA ENGINEERING CAPSTONE