Overview
This course takes an in-depth look at the core concepts of Apache Airflow, focusing on DAGs, Tasks, and Operators, which form the foundation of any orchestrated pipeline. Students will learn how these elements relate to one another, how they are defined in Python, and how they directly affect the reliability, readability, and scalability of workflows. The course combines concepts with practice, preparing professionals to write well-structured DAGs that follow data engineering best practices.
Course Content
Module 1: Airflow Core Concepts
- What is Apache Airflow
- Workflow orchestration fundamentals
- Airflow execution model
- Role of DAGs, tasks and operators
Module 2: Understanding DAGs
- What is a DAG
- DAG structure and files
- DAG parameters and configuration
- Scheduling and execution dates
Module 3: DAG Lifecycle and Behavior
- Parsing and loading DAGs
- DAG runs and task instances
- Execution timeline
- Catchup and backfill concepts
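As a preview of the catchup topic in Module 3, the sketch below models which data intervals a scheduler would still owe when a DAG with a past start date is first enabled. It is a simplified, standard-library-only model and not Airflow's scheduler; the function name `missed_intervals` and its parameters are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta


def missed_intervals(start_date, now, interval, catchup):
    """Return the (start, end) data intervals that are already complete.

    Hypothetical helper: with catchup enabled every completed interval
    since start_date is returned; with catchup disabled only the most
    recent completed interval is kept.
    """
    intervals = []
    start = start_date
    # An interval is only runnable once it has fully elapsed.
    while start + interval <= now:
        intervals.append((start, start + interval))
        start += interval
    if not catchup:
        return intervals[-1:]
    return intervals


start = datetime(2024, 1, 1)
now = datetime(2024, 1, 4, 12, 0)
daily = timedelta(days=1)

with_catchup = missed_intervals(start, now, daily, catchup=True)
without_catchup = missed_intervals(start, now, daily, catchup=False)

print(len(with_catchup))     # three completed daily intervals
print(without_catchup)       # only the Jan 3 to Jan 4 interval
```

The partial interval that began on January 4 is not included in either result, because in this model a run is only created after its interval has ended.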
Module 4: Tasks in Airflow
- What is a task
- Task instances
- Task states and transitions
- Retries, delays and timeouts
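To make the retry topics in Module 4 more concrete, here is a minimal sketch of a task that is retried a fixed number of times with a delay between attempts. It is a toy model in plain Python, not Airflow's executor: `run_with_retries` and its parameters are hypothetical names, and the state strings only mirror the names Airflow uses for task instance states.

```python
import time


def run_with_retries(task_callable, retries, retry_delay_seconds, sleep=time.sleep):
    """Run task_callable up to retries + 1 times and record each state.

    Hypothetical helper that returns the list of states the task passed
    through, so the transitions can be inspected.
    """
    states = []
    for attempt in range(retries + 1):
        states.append("running")
        try:
            task_callable()
        except Exception:
            if attempt < retries:
                # Attempts remain: wait, then try again.
                states.append("up_for_retry")
                sleep(retry_delay_seconds)
            else:
                # No attempts left: the failure is final.
                states.append("failed")
        else:
            states.append("success")
            break
    return states


calls = {"count": 0}


def flaky_task():
    """Fails on the first two calls, succeeds on the third."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")


# sleep is replaced with a no-op so the example finishes immediately.
history = run_with_retries(flaky_task, retries=2, retry_delay_seconds=30,
                           sleep=lambda seconds: None)
print(history)
```

With `retries=1` the same flaky task would end in `failed`, since the third attempt is never made.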
Module 5: Operators Fundamentals
- What is an operator
- Action vs sensor vs transfer operators
- Built-in operators overview
- Choosing the right operator
Module 6: Commonly Used Operators
- BashOperator
- PythonOperator
- EmptyOperator (formerly DummyOperator)
- Branching operators
Module 7: Defining Dependencies
- Upstream and downstream
- Bitshift operators
- Complex dependency patterns
- Parallel and conditional execution
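The bitshift syntax covered in Module 7 relies on Python operator overloading. The sketch below shows the idea with a toy `Task` class; it is an illustration under simplifying assumptions and not Airflow's `BaseOperator`, although the method name `set_downstream` follows the name Airflow uses.

```python
class Task:
    """Toy stand-in for an Airflow task (hypothetical, for illustration)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()
        self.downstream = set()

    def set_downstream(self, other):
        # Accept a single task or a list of tasks.
        others = other if isinstance(other, list) else [other]
        for task in others:
            self.downstream.add(task.task_id)
            task.upstream.add(self.task_id)

    def __rshift__(self, other):
        # a >> b means "b runs after a"; returning other allows chaining.
        self.set_downstream(other)
        return other

    def __rrshift__(self, other):
        # [a, b] >> c: a list has no __rshift__, so Python falls back
        # to c.__rrshift__([a, b]).
        for task in other:
            task.set_downstream(self)
        return self


extract = Task("extract")
transform_a = Task("transform_a")
transform_b = Task("transform_b")
load = Task("load")

# Fan out to two parallel tasks, then fan back in.
extract >> [transform_a, transform_b] >> load

print(sorted(extract.downstream))
print(sorted(load.upstream))
```

The two transform tasks share no edge between them, so a scheduler would be free to run them in parallel once `extract` has finished.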
Module 8: Best Practices for DAG Design
- Readable DAG structure
- Idempotent task design
- Avoiding anti-patterns
- Organizing DAG code
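Idempotent task design, listed in Module 8, can be sketched as a task whose output location is derived from the run's logical date and is overwritten on every run, so a rerun leaves the same result instead of appending duplicates. The function `export_daily_total`, the record layout, and the file naming are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def export_daily_total(records, logical_date, output_dir):
    """Write the total for one date to a file named after that date.

    Hypothetical task body: the target path depends only on
    logical_date, and write_text overwrites any earlier output.
    """
    total = sum(r["amount"] for r in records if r["date"] == logical_date)
    target = Path(output_dir) / f"total_{logical_date}.json"
    target.write_text(json.dumps({"date": logical_date, "total": total}))
    return target


records = [
    {"date": "2024-01-01", "amount": 10},
    {"date": "2024-01-01", "amount": 5},
    {"date": "2024-01-02", "amount": 7},
]

with tempfile.TemporaryDirectory() as output_dir:
    first_path = export_daily_total(records, "2024-01-01", output_dir)
    first_content = first_path.read_text()
    # Simulate a retry or manual rerun of the same logical date.
    second_path = export_daily_total(records, "2024-01-01", output_dir)
    second_content = second_path.read_text()
    file_count = len(list(Path(output_dir).iterdir()))

print(first_content == second_content, file_count)
```

A non-idempotent variant would open the file in append mode or name it after the wall-clock time, and each rerun would then change the final state.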
Module 9: Debugging and Troubleshooting
- Understanding logs
- Common DAG errors
- Task failure analysis
- Testing DAGs locally
Module 10: Preparing for Advanced Airflow Usage
- Dynamic DAGs overview
- Custom operators introduction
- Sensors and event-driven pipelines
- Next steps in Airflow mastery