Overview
This advanced course goes deeper into using Apache Flink SQL to build complex streaming pipelines. It covers advanced windowing, sophisticated temporal joins, query optimization, performance tuning, and integration with large-scale data architectures. It is aimed at people who already know the basics and want to use Flink SQL in real-world, high-demand, low-latency scenarios.
Course Content
Module 1 – Advanced Flink SQL Internals
- Deep dive into the SQL Planner and Optimizer
- Understanding changelog modes and updates
- State management under the hood
- How Flink handles retractions
Module 2 – Complex Table Definitions
- Advanced connector configurations
- Advanced watermarking techniques
- Schema evolution: add, remove, alter columns
- Using custom formats (JSON, Avro, Debezium, Protobuf)
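As a rough illustration of the kind of table definition this module works with, the sketch below declares an event-time column with a bounded out-of-orderness watermark. The table name, columns, and the 5-second delay are hypothetical, and the built-in `datagen` connector is used only so the statement runs without external systems; a course pipeline would more likely point at Kafka with one of the formats listed above.

```sql
-- Hypothetical table: names, types, and the 5-second delay are illustrative only.
CREATE TEMPORARY TABLE orders (
  order_id   INT,
  amount     DOUBLE,
  order_time TIMESTAMP(3),
  -- Tolerate events arriving up to 5 seconds out of order.
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',      -- built-in test source, stands in for Kafka here
  'rows-per-second' = '5'
);
```

The watermark delay trades latency for completeness: a larger interval admits more late events into windows but postpones every result by the same amount.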
Module 3 – Advanced Time and Window Processing
- Session windows with advanced gap strategies
- Multi-window pipelines
- Window merging and splitting
- Handling late data with precision
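A minimal sketch of a session window, assuming a hypothetical `clicks` table and a fixed 30-second inactivity gap. It uses the legacy group-window syntax, which is widely supported; recent Flink releases also offer a table-valued `SESSION` function, whose availability depends on the version in use.

```sql
-- Hypothetical source table backed by the built-in datagen connector.
CREATE TEMPORARY TABLE clicks (
  user_id    INT,
  url        STRING,
  click_time TIMESTAMP(3),
  WATERMARK FOR click_time AS click_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5',
  'fields.user_id.min' = '1',
  'fields.user_id.max' = '10'
);

-- One row per user session; a session closes after 30 seconds without clicks.
SELECT
  user_id,
  SESSION_START(click_time, INTERVAL '30' SECOND) AS session_start,
  SESSION_END(click_time, INTERVAL '30' SECOND)   AS session_end,
  COUNT(*)                                        AS clicks_in_session
FROM clicks
GROUP BY user_id, SESSION(click_time, INTERVAL '30' SECOND);
```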
Module 4 – Advanced Joins
- Interval joins
- Complex temporal table joins
- Stream–stream joins with large state
- Lookup joins with high throughput
- Multi-way joins and performance considerations
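To make the first join type concrete, here is a sketch of an interval join between two hypothetical streams, `orders` and `shipments`. The table names, columns, and the 4-hour bound are assumptions for illustration. Because the time bound limits how long each side must be retained, state for this join can be cleaned up as watermarks advance, unlike an unbounded stream-stream join.

```sql
-- Hypothetical tables; both need an event-time attribute for an interval join.
CREATE TEMPORARY TABLE orders (
  order_id   INT,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'fields.order_id.min' = '1',
  'fields.order_id.max' = '100'
);

CREATE TEMPORARY TABLE shipments (
  order_id  INT,
  ship_time TIMESTAMP(3),
  WATERMARK FOR ship_time AS ship_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'fields.order_id.min' = '1',
  'fields.order_id.max' = '100'
);

-- Match each order with shipments that occur within 4 hours of the order.
SELECT o.order_id, o.order_time, s.ship_time
FROM orders o
JOIN shipments s
  ON o.order_id = s.order_id
 AND s.ship_time BETWEEN o.order_time AND o.order_time + INTERVAL '4' HOUR;
```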
Module 5 – Aggregations and Pattern Processing
- Complex aggregations on dynamic tables
- Incremental aggregations
- Pattern Recognition with MATCH_RECOGNIZE
- Real-time anomaly detection using SQL
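The pattern-matching topic can be sketched with a simple fraud-style rule: a very small transaction followed directly by a large one on the same card within 10 minutes. The `transactions` table, its columns, and the thresholds are hypothetical. `MATCH_RECOGNIZE` relies on Flink's CEP library being available to the planner, which may need to be added as a dependency depending on the deployment.

```sql
-- Hypothetical source table; thresholds below are illustrative only.
CREATE TEMPORARY TABLE transactions (
  card_id  INT,
  amount   DOUBLE,
  txn_time TIMESTAMP(3),
  WATERMARK FOR txn_time AS txn_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'fields.card_id.min' = '1',
  'fields.card_id.max' = '20',
  'fields.amount.min' = '0',
  'fields.amount.max' = '1000'
);

SELECT *
FROM transactions
MATCH_RECOGNIZE (
  PARTITION BY card_id
  ORDER BY txn_time
  MEASURES
    A.amount   AS small_amount,
    B.amount   AS large_amount,
    B.txn_time AS alert_time
  ONE ROW PER MATCH
  AFTER MATCH SKIP PAST LAST ROW
  -- A must be immediately followed by B, all within 10 minutes.
  PATTERN (A B) WITHIN INTERVAL '10' MINUTE
  DEFINE
    A AS A.amount < 1,
    B AS B.amount > 500
);
```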
Module 6 – Query Optimization and Performance Tuning
- Understanding query plans and EXPLAIN
- Memory and state optimization strategies
- Tuning parallelism, slots, and resources
- Reducing backpressure
- Avoiding hotspots in streaming SQL
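As a sketch of how the tuning topics in this module usually start, the statements below inspect a plan and enable mini-batch with two-phase aggregation, a common way to reduce state access and soften hot keys. The `page_views` table is hypothetical and the option values are placeholders, not recommendations; appropriate settings depend on the workload.

```sql
-- Hypothetical source table.
CREATE TEMPORARY TABLE page_views (
  user_id INT,
  url     STRING
) WITH (
  'connector' = 'datagen',
  'fields.user_id.min' = '1',
  'fields.user_id.max' = '10'
);

-- Buffer input records briefly so aggregations touch state once per batch.
SET 'table.exec.mini-batch.enabled' = 'true';
SET 'table.exec.mini-batch.allow-latency' = '5 s';
SET 'table.exec.mini-batch.size' = '5000';
-- Pre-aggregate locally before the keyed shuffle to ease skewed keys.
SET 'table.optimizer.agg-phase-strategy' = 'TWO_PHASE';
-- Expire idle state so unbounded aggregations do not grow forever.
SET 'table.exec.state.ttl' = '1 h';

-- Show the optimized plan for an unbounded aggregation.
EXPLAIN PLAN FOR
SELECT user_id, COUNT(*) AS views
FROM page_views
GROUP BY user_id;
```

Comparing the plan before and after the `SET` statements is one way to see whether the optimizer introduced a local aggregation stage.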
Module 7 – Building Production-Ready Pipelines
- Designing end-to-end pipelines using Kafka + Flink SQL
- Multi-sink pipelines
- Error handling, retries, and exactly-once semantics
- Logging, observability, and metrics
Module 8 – Real Use Cases
- Real-time fraud detection pipeline
- Clickstream analytics at scale
- CDC processing using Flink SQL
- IoT event processing with high cardinality
Module 9 – Best Practices
- Naming conventions and governance
- Managing catalogs in production
- Handling schema drift
- Lessons learned from large-scale deployments