Overview
The Apache Flink SQL Fundamentals course introduces the essential concepts for streaming and batch data processing with Apache Flink's SQL module. Participants learn to write SQL queries that transform, enrich, analyze, and manipulate data in real time, integrating with systems such as Apache Kafka, databases, and data lakes.
The course combines theory with hands-on practice in an individual lab, providing a solid foundation for building modern, scalable, event-driven data pipelines.
Course Content
Module 1 – Introduction to Apache Flink SQL
- What is Flink SQL
- Streaming vs Batch processing
- Flink’s Table & SQL API
- Key concepts: Dynamic Tables, Continuous Queries, Changelog Streams
Module 2 – Flink SQL Architecture
- How Flink SQL works internally
- SQL Gateway and SQL Client
- Planner, Optimizer, and Execution Model
- Overview of Catalogs, Tables, and Schemas
Module 3 – Working with Tables
- Creating tables (DDL)
- Managed vs External tables
- Table formats: JSON, Avro, CSV, Parquet
- Table connectors overview
Module 4 – Kafka Integration
- Creating Kafka source/sink tables
- Defining watermarks and event time
- Reading/writing streams via SQL
- Hands-on: Kafka → Flink SQL → Kafka pipeline
Module 5 – Querying Streaming Data
- Basic SELECT queries
- Filtering and projections
- Calculated columns
- Handling late events
Module 6 – Windowing Operations
- Tumbling, Hopping (sliding), and Session windows
- Window aggregates and metrics
- Window functions best practices
- Hands-on: Real-time KPI calculations
Module 7 – Joins in Streaming SQL
- Inner, left/right, full joins
- Temporal joins
- Lookup joins with JDBC tables
- Streaming enrichment patterns
Module 8 – Aggregations & Analytics
- Group-by processing
- Incremental and complete aggregates
- Top-N and Ranking queries
- Complex event transformations
Module 9 – Working with Time in Flink SQL
- Event Time vs Processing Time
- Watermarks
- Out-of-order event handling
Module 10 – Deploying SQL Pipelines
- Submitting SQL jobs to Flink clusters
- Using SQL Gateway for deployment
- Monitoring jobs
- Debugging and troubleshooting
Module 11 – Real-Time Use Cases
- ETL and data enrichment pipelines
- Fraud detection and real-time alerts
- Live dashboards and analytics
- IoT and event monitoring