Overview
This course covers high-availability strategies for delivering metrics through Telegraf outputs, teaching how to design resilient, fault-tolerant pipelines suited to critical environments. The focus is on avoiding data loss, isolating failures, and ensuring operational continuity even when some destinations are unavailable.
Course Content
Module 1 – High Availability Concepts for Data Pipelines
- What high availability means for metrics
- Availability vs durability vs consistency
- Failure scenarios in output delivery
- Traditional HA design principles
Module 2 – Telegraf Output Reliability Model
- Output plugin execution lifecycle
- Buffering and retry mechanisms
- Flush behavior under failure
- Delivery guarantees and limitations
Module 3 – Redundant Output Architectures
- Active-active output patterns
- Active-passive configurations
- Multi-endpoint outputs
- Trade-offs and design decisions
Module 4 – Failover and Failure Isolation
- Detecting output failures
- Preventing cascading failures
- Output isolation strategies
- Partial delivery handling
Module 5 – Buffering, Queues and Backpressure
- Memory vs disk buffering
- Handling slow or unavailable destinations
- Backpressure mitigation
- Data loss prevention techniques
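To give a flavor of the buffering topics in this module, here is a minimal Python sketch of a bounded in-memory buffer for a single output. It is an illustration only, not Telegraf's implementation (Telegraf is written in Go); `OutputBuffer`, `limit`, and `write` are hypothetical names chosen for this example. The drop-oldest policy is modeled on how the Telegraf documentation describes the `metric_buffer_limit` agent setting.

```python
from collections import deque


class OutputBuffer:
    """Bounded in-memory buffer for one output (hypothetical helper)."""

    def __init__(self, limit):
        # A deque with maxlen silently discards the oldest item when full.
        self.metrics = deque(maxlen=limit)
        self.dropped = 0

    def add(self, metric):
        # Count the metric that is about to be evicted, so loss is observable.
        if len(self.metrics) == self.metrics.maxlen:
            self.dropped += 1
        self.metrics.append(metric)

    def flush(self, write):
        """Try to write the buffered batch; keep it for a later retry on failure."""
        batch = list(self.metrics)
        try:
            write(batch)
        except ConnectionError:
            return False  # destination unavailable: nothing is removed
        self.metrics.clear()
        return True


def unavailable(batch):
    # Stand-in for a destination that is currently down.
    raise ConnectionError("destination down")


buf = OutputBuffer(limit=3)
for i in range(5):
    buf.add(i)               # the two oldest metrics (0 and 1) are dropped

buf.flush(unavailable)       # fails, so the buffer is kept for retry

delivered = []
buf.flush(delivered.extend)  # destination is back: the batch is delivered
print(delivered, buf.dropped)
```

The sketch shows the trade-off discussed in this module: a memory buffer bounds resource usage during an outage, but once it fills, the oldest data is lost unless a disk-backed buffer or an external queue takes over.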
Module 6 – High Availability with Common Outputs
- InfluxDB HA delivery patterns
- Kafka cluster-based resilience
- MQTT broker redundancy
- HTTP endpoint failover
Module 7 – Monitoring and Testing HA Outputs
- Observing output health
- Telegraf internal metrics
- Failure simulation and chaos testing
- Alerting strategies
Module 8 – Production-Grade HA Design Scenarios
- Observability pipelines
- Mission-critical infrastructure monitoring
- Industrial and IoT environments
- Best practices and common pitfalls