Overview
This course teaches how to build complete real-time analytics pipelines using Apache Flink SQL integrated with Apache Kafka. You will learn to consume data from streams, apply windows and advanced aggregations, detect patterns, run continuous analyses, and send results to analytical systems. It is a hands-on course aimed at anyone who wants to implement modern real-time analytics solutions at large scale.
Course Content
Module 1 – Introduction to Real-Time Analytics
- What is real-time analytics?
- Batch vs streaming analytical models
- When to use Flink SQL + Kafka
Module 2 – Kafka Fundamentals for Streaming Analytics
- Kafka topics, partitions, retention
- Producers, consumers, consumer groups
- Designing event schemas for analytics
- High-throughput ingestion strategies
Module 3 – Flink SQL Essentials for Analytics
- Dynamic tables and changelog streams
- Defining Kafka sources using SQL DDL
- Understanding event time and watermarks
Module 4 – Windowed Analytics
- Tumbling, hopping, and cumulative windows
- Session windows for user behavior analytics
- Windowed aggregations at scale
- Working with late events and watermarks
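To give a feel for the Module 4 topics, the following is a minimal Python sketch of how tumbling-window assignment and watermark-based handling of late events can be reasoned about. It is a simplified model for illustration, not Flink's implementation: the names `window_start`, `count_per_window`, `WINDOW_SIZE_MS`, and `max_out_of_order_ms` are hypothetical, and the watermark is assumed to trail the largest event time seen so far by a fixed bound.

```python
from collections import defaultdict

WINDOW_SIZE_MS = 60_000  # hypothetical 1-minute tumbling window


def window_start(event_time_ms, size_ms=WINDOW_SIZE_MS):
    """Return the start of the tumbling window that contains the event."""
    return event_time_ms - (event_time_ms % size_ms)


def count_per_window(events, size_ms=WINDOW_SIZE_MS, max_out_of_order_ms=5_000):
    """Count events per (window_start, key), dropping events that arrive late.

    `events` is an iterable of (event_time_ms, key) pairs in arrival order.
    Assumption: the watermark trails the largest event time seen so far by
    `max_out_of_order_ms`; a window is treated as closed once the watermark
    reaches its end.
    """
    counts = defaultdict(int)
    dropped = []
    max_seen = None
    for event_time_ms, key in events:
        # Advance the watermark using the largest event time observed so far.
        max_seen = event_time_ms if max_seen is None else max(max_seen, event_time_ms)
        watermark = max_seen - max_out_of_order_ms
        start = window_start(event_time_ms, size_ms)
        if start + size_ms <= watermark:
            # The event's window has already closed: treat the event as late.
            dropped.append((event_time_ms, key))
            continue
        counts[(start, key)] += 1
    return dict(counts), dropped


if __name__ == "__main__":
    sample = [(1_000, "a"), (61_000, "a"), (59_000, "a"), (70_000, "b"), (2_000, "a")]
    counts, dropped = count_per_window(sample)
    print(counts)
    print(dropped)
```

In this sketch the event at 59,000 ms arrives out of order but is still counted, because the watermark has not yet passed the end of its window; the event at 2,000 ms arrives after the watermark has moved beyond that window and is dropped. In Flink SQL the equivalent behavior is declared rather than coded by hand, through a watermark definition on the source table and a windowed aggregation.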
Module 5 – Advanced Analytical Queries
- Trend detection queries
- Time-series analytics
- Pattern detection with MATCH_RECOGNIZE
- Funnel analysis and clickstream pipelines
Module 6 – Joining Streaming Data
- Stream–stream joins
- Temporal table joins
- Lookup joins for reference enrichment
- Optimizing join performance
Module 7 – Building End-to-End Pipelines
- Kafka → Flink SQL → Kafka
- Kafka → Flink SQL → Elasticsearch / OLAP systems
- Creating dashboards with BI tools (Grafana, Superset)
- Error handling and exactly-once analytics
Module 8 – Performance and Optimization
- Managing backpressure
- Choosing parallelism and resources
- Connector tuning for high-volume analytics
- Memory and state management
Module 9 – Real-World Use Cases
- Real-time metrics pipeline
- Fraud detection analytics
- Monitoring IoT sensor streams
- Marketing and user behavior analytics
Module 10 – Best Practices
- Schema governance for analytics
- Handling schema evolution
- Data quality in streaming
- Operational considerations for production