Overview
This course covers the development of real-time data processing pipelines and applications using Apache Flink. You will learn to work with continuous events, windows, state, and time semantics, and to integrate Flink with modern messaging and storage systems. The focus is on building robust, fast, and scalable applications for real-world scenarios.
Course Content
Module 1 – Introduction to Real-Time Processing
- Understanding real-time and near-real-time data
- Event-driven architecture
- Why Apache Flink for stream processing?
- Flink ecosystem overview
Module 2 – Flink Core Concepts
- Streams, events, and operators
- Time semantics (processing, ingestion, event time)
- Watermarks and lateness
- Stateful and stateless operations
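As a rough illustration of the watermark and lateness topics in this module, the sketch below tracks a bounded-out-of-orderness watermark in plain Python. It does not use the Flink API; the class name `BoundedOutOfOrdernessTracker` and its methods are hypothetical and exist only for this example. It assumes the watermark trails the highest event timestamp seen so far by a fixed bound, minus one millisecond.

```python
class BoundedOutOfOrdernessTracker:
    """Hypothetical helper (not a Flink class) that mimics a bounded-out-of-orderness watermark."""

    def __init__(self, max_out_of_orderness_ms):
        self.max_out_of_orderness_ms = max_out_of_orderness_ms
        self.max_timestamp_ms = None  # highest event timestamp observed so far

    def observe(self, event_timestamp_ms):
        # Events may arrive out of order, so only ever move the maximum forward.
        if self.max_timestamp_ms is None or event_timestamp_ms > self.max_timestamp_ms:
            self.max_timestamp_ms = event_timestamp_ms

    def current_watermark(self):
        # No events yet means no watermark can be derived.
        if self.max_timestamp_ms is None:
            return None
        return self.max_timestamp_ms - self.max_out_of_orderness_ms - 1

    def is_behind_watermark(self, event_timestamp_ms):
        # An event at or below the watermark arrived later than the bound allows.
        watermark = self.current_watermark()
        return watermark is not None and event_timestamp_ms <= watermark


tracker = BoundedOutOfOrdernessTracker(max_out_of_orderness_ms=2000)
for ts in (10_000, 9_000, 12_000):  # event timestamps in arrival order
    tracker.observe(ts)
```

With a 2-second bound, an event stamped 9,500 ms that arrives after the event stamped 12,000 ms falls behind the watermark, while one stamped 10,500 ms does not.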
Module 3 – Setting Up Flink for Streaming
- Local installation
- Project setup using Maven or Gradle
- Working with Flink DataStream API
Module 4 – Transformations and Operators
- Map, flatMap, filter
- Keyed streams
- Rich functions
- Custom operators
Module 5 – Windowing & Event Time
- Tumbling, sliding, and session windows
- Window assigners and triggers
- Handling out-of-order events
- Late data strategies
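To make the difference between tumbling and sliding windows concrete, here is a minimal plain-Python sketch of how an event timestamp could be mapped to window boundaries. It is not Flink code; the function names `tumbling_window` and `sliding_windows` are hypothetical, and it assumes windows are aligned to the epoch, non-negative timestamps, and a window size that is a multiple of the slide.

```python
def tumbling_window(timestamp_ms, size_ms, offset_ms=0):
    """Return the single [start, end) window that contains the timestamp."""
    start = timestamp_ms - ((timestamp_ms - offset_ms) % size_ms)
    return (start, start + size_ms)


def sliding_windows(timestamp_ms, size_ms, slide_ms):
    """Return every [start, end) window that contains the timestamp, newest first."""
    windows = []
    # The most recent window start at or before the timestamp.
    start = timestamp_ms - (timestamp_ms % slide_ms)
    # Walk backwards one slide at a time while the window still covers the timestamp.
    while start > timestamp_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows


one_window = tumbling_window(12_345, size_ms=5_000)
overlapping = sliding_windows(12_345, size_ms=10_000, slide_ms=5_000)
```

A tumbling window assigns each event to exactly one window, whereas a sliding window with a 10-second size and 5-second slide assigns the same event to two overlapping windows.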
Module 6 – State Management
- Keyed state fundamentals
- Operator state
- State backends (RocksDB, filesystem, memory)
- State TTL and cleanup policies
Module 7 – Checkpoints & Fault Tolerance
- Consistency models
- Checkpointing mechanism
- Savepoints for upgrades
- Recovery workflows
Module 8 – Integration with Streaming Platforms
- Kafka as source and sink
- Schema registry and serialization formats
- Integrating with databases and cloud storage
- Producing enriched and aggregated streams
Module 9 – Real-Time Patterns & Use Cases
- Stream enrichment
- Event routing
- Real-time fraud detection
- Real-time analytics dashboards
Module 10 – Monitoring & Optimization
- Flink UI and metrics
- Identifying backpressure
- Tuning parallelism and resources
- Performance optimization best practices
Module 11 – Final Project
- Designing a complete real-time processing pipeline
- Ingesting data from Kafka
- Applying transformations, windows, and state
- Exporting results to downstream systems
- Deploying and validating the pipeline