Overview
This course offers a comprehensive look at the Datadog platform applied to modern observability, covering metrics, logs, APM, and distributed tracing in depth. Students learn to instrument applications, configure dashboards, create intelligent alerts, monitor complex services, and interpret end-to-end performance data. The course combines theory with hands-on lab practice so that students can work with the platform confidently in real environments.
Course Outline
1. Introduction to Datadog Observability
- What is Observability
- Core pillars: Metrics, Logs, APM, and Tracing
- Overview of Datadog platform and components
- Datadog Agent architecture
- Key concepts: tags, indexes, scopes, monitors
2. Datadog Metrics
- Understanding metric types: gauges, counters, histograms
- Installing and configuring the Datadog Agent
- Collecting system and application metrics
- Host maps and infrastructure overview
- Custom metrics: creating and sending metrics
- Metric tagging and enrichment
- Building real-time dashboards
- Metric-based alerting and anomaly detection
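As a preview of the custom metrics and tagging topics above, the following is a minimal sketch of how a tagged metric can be serialized in the DogStatsD datagram format (`name:value|type|#tag:value,...`) and sent over UDP. The function names, the metric name, and the tags are hypothetical, and the host and port assume a local Agent listening on the default DogStatsD port 8125. In practice an official client library would normally handle this.

```python
import socket

# DogStatsD type codes for the metric types covered in this module.
METRIC_TYPES = {"count": "c", "gauge": "g", "histogram": "h"}


def format_datagram(name, value, kind, tags=None):
    """Build a DogStatsD datagram: name:value|type|#tag:value,..."""
    parts = [f"{name}:{value}", METRIC_TYPES[kind]]
    if tags:
        # Sort tags so the output is deterministic.
        tag_text = ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
        parts.append("#" + tag_text)
    return "|".join(parts)


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (fire-and-forget, no delivery guarantee).

    Assumes an Agent with DogStatsD enabled on host:port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name and tags, for illustration only.
    print(format_datagram("shop.checkout.count", 1, "count",
                          {"env": "prod", "service": "checkout"}))
```

Tags are what make the metric filterable and groupable in dashboards and monitors, which is why the tagging topic sits alongside custom metrics in this module.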
3. Datadog Logs Management
- Log collection pipeline architecture
- Enabling log collection on hosts and containers
- Parsing logs with pipelines, processors, and filters
- Log indexes and retention policies
- Searching, filtering, and analyzing logs
- Creating log-based metrics
- Log patterns, facets, and dashboards
- Error tracking and alerting from logs
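To make the parsing and log-based metrics topics above concrete, here is a small sketch in plain Python of what a parsing step does conceptually: it turns a raw access-log line into structured attributes, and a count over those attributes then behaves like a log-based metric. This is not Datadog's pipeline syntax. The regular expression, the attribute names, and the sample log line are assumptions chosen for illustration.

```python
import re

# Assumed access-log layout: ip - - [timestamp] "METHOD path protocol" status bytes
LINE = re.compile(
    r'(?P<client_ip>\S+) - - \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def parse_access_log(line):
    """Return structured attributes for one log line, or None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    attrs = match.groupdict()
    attrs["status"] = int(attrs["status"])
    attrs["bytes"] = int(attrs["bytes"])
    return attrs


def count_server_errors(lines):
    """A log-based metric in miniature: the number of parsed lines with a 5xx status."""
    parsed = (parse_access_log(line) for line in lines)
    return sum(1 for attrs in parsed if attrs and attrs["status"] >= 500)


if __name__ == "__main__":
    sample = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /cart HTTP/1.1" 502 512'
    print(parse_access_log(sample))
```

Once attributes such as `status` and `path` are extracted, they can serve as facets for searching and filtering, which is the workflow this module builds toward.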
4. Datadog APM (Application Performance Monitoring)
- Understanding APM concepts
- Installing APM instrumentation (Java, Python, Node.js, Go, PHP, .NET)
- Service map and service visualization
- Monitoring requests, latency, errors, and throughput
- Profiling applications with Datadog Profiler
- APM dashboards and analytics
- Detecting performance bottlenecks
5. Distributed Tracing
- Concepts of distributed tracing
- Trace propagation across microservices
- Spans, traces, resources, and services
- Instrumenting distributed applications
- Using tracing to identify errors and slowdowns
- End-to-end transaction troubleshooting
- Trace search and analytics
- Integrating tracing with logs and metrics
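The relationship between traces, spans, and propagation listed above can be sketched without any tracing library. The toy model below assumes the `x-datadog-trace-id` and `x-datadog-parent-id` HTTP headers as the propagation carrier. The `Span` class and all helper names are hypothetical, and a real tracer records much more (timing, errors, sampling decisions).

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: int             # shared by every span in one end-to-end request
    span_id: int              # unique to this unit of work
    parent_id: Optional[int]  # None for the root span
    service: str
    resource: str


def new_id():
    """Random 63-bit identifier (illustrative only)."""
    return random.getrandbits(63)


def start_span(service, resource, parent=None):
    """Start a root span, or a child span inside the same process."""
    trace_id = parent.trace_id if parent else new_id()
    parent_id = parent.span_id if parent else None
    return Span(trace_id, new_id(), parent_id, service, resource)


def inject(span, headers):
    """Copy the trace context into outgoing request headers."""
    headers["x-datadog-trace-id"] = str(span.trace_id)
    headers["x-datadog-parent-id"] = str(span.span_id)
    return headers


def extract(headers, service, resource):
    """Continue the trace in the downstream service from incoming headers."""
    return Span(
        trace_id=int(headers["x-datadog-trace-id"]),
        span_id=new_id(),
        parent_id=int(headers["x-datadog-parent-id"]),
        service=service,
        resource=resource,
    )


if __name__ == "__main__":
    root = start_span("web", "GET /cart")
    downstream = extract(inject(root, {}), "cart-api", "GET /items")
    print(downstream.trace_id == root.trace_id)  # both spans belong to one trace
```

Because the downstream span reuses the upstream trace ID and records the caller's span ID as its parent, the backend can reassemble the spans from separate services into a single end-to-end trace.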
6. Integrations and Ecosystem
- AWS, GCP, Azure integrations
- Docker and Kubernetes monitoring
- Database monitoring (MySQL, Postgres, MongoDB, Redis, etc.)
- Serverless monitoring
- Load balancers, API gateways, web servers
- Network performance monitoring (NPM)
7. Dashboards and Visualization
- Designing effective observability dashboards
- Widgets: timeseries, heatmaps, query values
- Using template variables
- Building team-specific dashboards
- Sharing and exporting dashboards
8. Alerts, Monitors, and Incident Management
- Monitor types (metric, log, APM, SLO, anomaly, outlier, forecast)
- Creating actionable alerting strategies
- SLOs (Service Level Objectives) and Error Budgets
- Integrating Datadog with incident management tools
- Notifications and escalation policies
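The SLO and error budget topic above rests on simple arithmetic, sketched here for a time-based SLO. The function names and the example numbers are illustrative assumptions, not values taken from the course material.

```python
def error_budget_minutes(slo_target, window_days):
    """Minutes of allowed unavailability for a time-based SLO.

    slo_target is a fraction, for example 0.999 for 99.9%.
    """
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target, window_days, bad_minutes):
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - bad_minutes) / budget


if __name__ == "__main__":
    budget = error_budget_minutes(0.999, 30)
    left = budget_remaining(0.999, 30, bad_minutes=10.8)
    print(f"budget: {budget:.1f} min, remaining: {left:.0%}")
```

A 99.9% target over a 30-day window allows about 43.2 minutes of unavailability, so 10.8 bad minutes would leave roughly three quarters of the budget. Alerting on how fast the budget is being spent, rather than on every individual error, is one way to keep alerts actionable.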
9. Security, Governance, and Best Practices
- RBAC and access management
- Data retention strategy
- Cost optimization in Datadog
- Tagging standards for observability
- Best practices for large-scale environments
10. Hands-on Labs and Final Project
- Install and configure Datadog Agent
- Collect metrics, logs, and traces from a sample microservices app
- Build a complete observability dashboard
- Configure monitors and SLOs
- Final observability project: diagnose real-world issues