Overview
This course covers the use of Telegraf as a central observability agent, showing how to collect, process, and deliver metrics reliably in support of Observability, DevOps, and SRE practices. The focus is on turning raw data into useful signals for understanding, operating, and evolving production systems.
Course Outline
Module 1 – Observability Foundations
- Monitoring vs observability
- Telemetry data types
- Signals and system understanding
- Observability mindset
Module 2 – Telegraf’s Role in Observability
- Telegraf as a telemetry agent
- Metrics collection architecture
- Plugin ecosystem overview
- Agent deployment patterns
Module 3 – Collecting High-Quality Metrics
- Infrastructure metrics
- Application metrics
- System and service signals
- Metric selection strategies
Module 4 – Processing and Enriching Telemetry
- Filtering noise
- Normalizing metrics
- Tag strategy for observability
- Aggregation for signal clarity
Module 5 – Delivering Metrics to Observability Stacks
- InfluxDB integration
- Prometheus remote write
- Kafka and streaming outputs
- Multi-destination delivery
Module 6 – Reliability and Performance
- High-frequency data handling
- Buffering and retry strategies
- Preventing data loss
- Performance tuning
Module 7 – Operating Telegraf in Production
- Configuration management
- Security and access control
- Monitoring Telegraf itself
- Troubleshooting pipelines
Module 8 – Observability Use Cases and Best Practices
- SRE-focused observability
- Incident detection and analysis
- Capacity planning
- Anti-patterns and lessons learned