Apache Flink + Apache Kafka Real-Time Pipelines Course

32-hour hands-on course
Overview

This course was designed for professionals who want to master building real-time data pipelines with Apache Flink and Apache Kafka. Students will learn fundamental and advanced stream processing concepts, Flink–Kafka integration, professional architectural patterns, performance optimization, delivery guarantees, and the implementation of complete end-to-end pipelines.
By the end, they will be able to build scalable, fault-tolerant, low-latency, and highly resilient systems.

Objective

After completing this Apache Flink + Apache Kafka (Real-Time Pipelines) course, you will be able to:

  • Connect and integrate Apache Flink with Apache Kafka using the modern connectors.
  • Build streaming pipelines with high throughput and low latency.
  • Implement windows, state, event time, and advanced transformations.
  • Develop pipelines with exactly-once delivery guarantees.
  • Tune jobs, configure parallelism, and mitigate backpressure.
  • Build enterprise pipelines with DLQs, schema evolution, and monitoring.
  • Deploy complete pipelines to distributed environments such as Kubernetes and Docker.
Target Audience
  • Data Engineers
  • Backend Developers
  • Software Engineers
  • Streaming and Observability Engineers
  • Data Analysts and Data Scientists working with real-time data
Prerequisites
  • Apache Kafka fundamentals
  • Apache Flink fundamentals
  • Basic knowledge of Java, Scala, or Python
  • Notions of distributed architecture
Materials
English/Portuguese + Exercises + Hands-On Lab
Course Content

Module 1 – Kafka Architecture Essentials (4h)

  1. Core Kafka concepts
  2. Topics, partitions, replication
  3. Producers and consumers
  4. Consumer groups and rebalancing
  5. Offset management
  6. Kafka delivery semantics: at-most-once, at-least-once, exactly-once
  7. Schema Registry basics (Avro / JSON / Protobuf)
  8. Hands-on: Creating topics, producing and consuming events
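
A minimal sketch of the producer side of this hands-on, assuming a local broker at localhost:9092 and a hypothetical "events" topic (both placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");   // wait for all in-sync replicas before acking

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records with the same key always land in the same partition, preserving per-key order.
            producer.send(new ProducerRecord<>("events", "user-42", "{\"action\":\"login\"}"));
            producer.flush();
        }
    }
}

The consuming side is symmetric: a KafkaConsumer subscribed to the same topic as part of a consumer group, polling records and committing offsets.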

Module 2 – Flink Architecture & Runtime Review (3h)

  1. Flink cluster architecture overview
  2. JobManager, TaskManager, task slots
  3. Checkpoints, savepoints, barriers
  4. Flink event-time fundamentals
  5. DataStream vs Table API
  6. Hands-on: Running Flink locally and submitting jobs
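
A minimal sketch of a locally runnable DataStream job (Flink 1.x), useful for seeing the JobManager, TaskManagers, and checkpointing in action; the parallelism and checkpoint interval are illustrative:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalJob {
    public static void main(String[] args) throws Exception {
        // Outside a cluster this spins up an embedded mini-cluster (JobManager + TaskManagers).
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000);   // checkpoint barriers injected every 10 s
        env.setParallelism(2);             // two parallel subtasks per operator

        env.fromElements("flink", "kafka", "streams")
           .map(String::toUpperCase)
           .print();

        env.execute("local-review-job");
    }
}

Packaged as a jar, the same job can be submitted to a standalone cluster with the flink run CLI.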

Module 3 – Flink + Kafka Integration (6h)

3.1 Modern Kafka Source & Sink

  1. KafkaSource (new unified API)
  2. Offset strategies (earliest, latest, committed)
  3. KafkaSink with transactions
  4. Delivery semantics with Kafka + Flink
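
A sketch of the unified KafkaSource/KafkaSink wiring with exactly-once delivery via Kafka transactions; the broker address, topic names, group id, and transactional id prefix are all placeholders:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

public class KafkaToKafkaJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(30_000);   // transactions are committed when checkpoints complete

        KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("localhost:9092")                   // placeholder broker
            .setTopics("orders-in")                                  // placeholder topic
            .setGroupId("flink-orders")
            .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

        KafkaSink<String> sink = KafkaSink.<String>builder()
            .setBootstrapServers("localhost:9092")
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("orders-out")                              // placeholder topic
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
            .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)    // uses Kafka transactions
            .setTransactionalIdPrefix("orders-pipeline")             // must be unique per job
            .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
           .map(String::toUpperCase)
           .sinkTo(sink);

        env.execute("kafka-to-kafka-exactly-once");
    }
}

Exactly-once here depends on checkpointing being enabled: the sink commits its Kafka transaction only when the corresponding checkpoint completes.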

3.2 Working with Schema Registry

  1. Avro schema evolution rules
  2. Enforcing compatibility
  3. Integrating Flink with Confluent Schema Registry
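
A hedged sketch of plugging a Confluent Schema Registry-backed Avro deserializer into KafkaSource, assuming the flink-avro-confluent-registry dependency; the schema, topic, and registry URL are placeholders:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema;

public class AvroSourceExample {
    // Reader schema; in practice this comes from the subject registered in Schema Registry.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"amount\",\"type\":\"double\"}]}");

    static KafkaSource<GenericRecord> buildSource() {
        return KafkaSource.<GenericRecord>builder()
            .setBootstrapServers("localhost:9092")            // placeholder broker
            .setTopics("orders-avro")                         // placeholder topic
            .setGroupId("flink-avro-consumer")
            .setStartingOffsets(OffsetsInitializer.earliest())
            // Resolves the writer schema against the registry and decodes with the reader schema above.
            .setValueOnlyDeserializer(
                ConfluentRegistryAvroDeserializationSchema.forGeneric(SCHEMA, "http://localhost:8081"))
            .build();
    }
}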

3.3 Hands-on

  1. Build a streaming pipeline Flink → Kafka → Flink
  2. Testing exactly-once
  3. Handling consumer lag

Module 4 – Stream Transformations & Event-Time Processing (5h)

  1. Stateless transformations (map, filter, flatMap)
  2. Keyed stream patterns
  3. Event time vs processing time
  4. Watermarks and late events
  5. Window types:
      • Tumbling
      • Sliding
      • Session
      • Global windows
  6. Custom window functions
  7. Hands-on: Real-time aggregations with event time
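
A sketch of an event-time aggregation like the hands-on above: tumbling one-minute windows with bounded-out-of-orderness watermarks (Flink 1.x DataStream API). The Click type, its field names, and the 5-second lateness bound are hypothetical:

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowExample {
    // Hypothetical event type used only for illustration.
    public static class Click {
        public String userId;
        public long timestampMillis;
        public Click() {}
    }

    static DataStream<Long> clicksPerUserPerMinute(DataStream<Click> clicks) {
        return clicks
            // Watermarks tolerate events arriving up to 5 s out of order.
            .assignTimestampsAndWatermarks(
                WatermarkStrategy.<Click>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((click, ts) -> click.timestampMillis))
            .keyBy(click -> click.userId)
            .window(TumblingEventTimeWindows.of(Time.minutes(1)))
            // Count clicks per user per one-minute event-time window.
            .aggregate(new AggregateFunction<Click, Long, Long>() {
                @Override public Long createAccumulator() { return 0L; }
                @Override public Long add(Click value, Long acc) { return acc + 1; }
                @Override public Long getResult(Long acc) { return acc; }
                @Override public Long merge(Long a, Long b) { return a + b; }
            });
    }
}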

Module 5 – Stateful Stream Processing (4h)

  1. Keyed state and operator state
  2. ValueState, ListState, MapState
  3. Timers and state expiration
  4. RocksDB state backend in depth
  5. Checkpoints and fault tolerance
  6. Hands-on: Building a stateful fraud-detection pipeline
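
A sketch of the kind of stateful logic this hands-on builds: a KeyedProcessFunction that keeps a per-key flag in ValueState and clears it with a processing-time timer. The Txn type, thresholds, and one-minute timer are hypothetical:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class FraudDetector extends KeyedProcessFunction<String, FraudDetector.Txn, String> {

    // Hypothetical transaction type, used only for illustration.
    public static class Txn {
        public String cardId;
        public double amount;
        public Txn() {}
    }

    // One flag per key (card id), held in the configured state backend (e.g. RocksDB).
    private transient ValueState<Boolean> sawSmallAmount;

    @Override
    public void open(Configuration parameters) {
        sawSmallAmount = getRuntimeContext().getState(
            new ValueStateDescriptor<>("saw-small-amount", Types.BOOLEAN));
    }

    @Override
    public void processElement(Txn txn, Context ctx, Collector<String> out) throws Exception {
        Boolean flagged = sawSmallAmount.value();
        if (flagged != null && flagged && txn.amount > 500.0) {      // hypothetical threshold
            out.collect("Possible fraud on card " + ctx.getCurrentKey());
        }
        if (txn.amount < 1.0) {
            sawSmallAmount.update(true);
            // Expire the flag after one minute of processing time.
            ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 60_000);
        } else {
            sawSmallAmount.clear();
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        sawSmallAmount.clear();
    }
}

It would be applied as transactions.keyBy(txn -> txn.cardId).process(new FraudDetector()).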

Module 6 – Streaming Patterns with Kafka + Flink (4h)

  1. Stream enrichment with broadcast state
  2. Multiple Kafka topics processing
  3. Side outputs / DLQ
  4. Reprocessing and backfill strategies
  5. CDC (Change Data Capture) with Flink + Debezium + Kafka
  6. End-to-end patterns:
      • Lambda architecture
      • Kappa architecture
      • Stateful event routing
  7. Hands-on: Pipeline with side outputs and DLQ
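
A sketch of the side-output/DLQ routing used in the hands-on: records that fail parsing go to an OutputTag-backed side stream, which in practice feeds a dedicated DLQ topic via KafkaSink. The tag name and parsing logic are illustrative:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class DlqRouting {
    // Anonymous subclass keeps the generic type information for the side output.
    static final OutputTag<String> DLQ = new OutputTag<String>("dlq") {};

    static SingleOutputStreamOperator<Long> parse(DataStream<String> raw) {
        return raw.process(new ProcessFunction<String, Long>() {
            @Override
            public void processElement(String value, Context ctx, Collector<Long> out) {
                try {
                    out.collect(Long.parseLong(value));     // happy path: main output
                } catch (NumberFormatException e) {
                    ctx.output(DLQ, value);                 // bad record: side output -> DLQ sink
                }
            }
        });
    }

    static DataStream<String> deadLetters(SingleOutputStreamOperator<Long> parsed) {
        // The DLQ stream is typically written to a dedicated Kafka topic via KafkaSink.
        return parsed.getSideOutput(DLQ);
    }
}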

Module 7 – Integration with External Systems (3h)

  1. JDBC Sink (PostgreSQL / MySQL)
  2. Connecting to ElasticSearch
  3. Exporting to S3 / MinIO
  4. Schema on read vs schema on write
  5. Hands-on: Kafka → Flink → PostgreSQL pipeline
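
A sketch of the PostgreSQL sink side of this hands-on, assuming the flink-connector-jdbc dependency and a PostgreSQL JDBC driver; the table, connection URL, and credentials are placeholders:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcExecutionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.streaming.api.datastream.DataStream;

public class PostgresSinkExample {
    static void addSink(DataStream<Tuple2<String, Double>> orders) {
        orders.addSink(JdbcSink.sink(
            "INSERT INTO orders (id, amount) VALUES (?, ?)",            // placeholder table
            (statement, order) -> {
                statement.setString(1, order.f0);
                statement.setDouble(2, order.f1);
            },
            JdbcExecutionOptions.builder()
                .withBatchSize(200)                  // write in batches to reduce round-trips
                .withBatchIntervalMs(1_000)
                .build(),
            new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                .withUrl("jdbc:postgresql://localhost:5432/streaming")   // placeholder database
                .withDriverName("org.postgresql.Driver")
                .withUsername("flink")                                   // placeholder credentials
                .withPassword("secret")
                .build()));
    }
}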

Module 8 – Performance Optimization & Backpressure (2h)

  1. Understanding backpressure
  2. Monitoring throughput and latency
  3. Operator chaining
  4. Parallelism tuning
  5. Slot allocations
  6. RocksDB tuning for large state
  7. Hands-on: Fixing a pipeline with backpressure
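
A sketch of a few tuning knobs from this module applied in code, assuming the flink-statebackend-rocksdb dependency (the package name varies across Flink versions); the values are illustrative, not recommendations:

import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TuningExample {
    static void configure(StreamExecutionEnvironment env) {
        env.setParallelism(8);             // default parallelism for every operator
        env.enableCheckpointing(60_000);   // longer interval = less checkpoint overhead
        // Incremental RocksDB checkpoints keep checkpoint size manageable for large state.
        env.setStateBackend(new EmbeddedRocksDBStateBackend(true));
        // Per-operator overrides also help, e.g.:
        //   stream.map(new HeavyMapper()).setParallelism(16).disableChaining();
        // where HeavyMapper is a hypothetical expensive operator isolated into its own chain.
    }
}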

Module 9 – Deployment & Observability (3h)

  1. Flink deployments:
      • Standalone
      • Docker
      • Kubernetes
      • Flink Native Kubernetes mode
  2. Kafka in distributed environments
  3. Observability:
      • Flink Web UI
      • Metrics
      • Logs
      • Prometheus + Grafana
  4. Hands-on: Deploying a Kafka + Flink pipeline in Docker Compose
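
A sketch of a user-defined metric that surfaces in the Flink Web UI and, with the Prometheus metrics reporter enabled in the Flink configuration, can be scraped into Grafana; the class and metric names are placeholders:

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

public class CountingMapper extends RichMapFunction<String, String> {
    private transient Counter eventsProcessed;

    @Override
    public void open(Configuration parameters) {
        // Registered under the operator's metric group; visible in the Web UI
        // and exported by whichever metrics reporter is configured.
        eventsProcessed = getRuntimeContext().getMetricGroup().counter("eventsProcessed");
    }

    @Override
    public String map(String value) {
        eventsProcessed.inc();
        return value;
    }
}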

Module 10 – Final Project (Training Capstone) (2h)

  1. Build an enterprise-grade streaming pipeline using:
      • KafkaSource + Schema Registry
      • Stateful Flink transformations
      • Event-time windowing
      • DLQ + retries
      • Exactly-once semantics
      • Sink to PostgreSQL / S3
      • Monitoring through the Flink UI
  2. Final deliverable: a working real-time pipeline.