Datadog Observability Course (Metrics + Logs + APM + Tracing)


24 hours
Overview

This course provides a comprehensive view of the Datadog platform applied to modern observability, covering metrics, logs, APM, and distributed tracing in depth. Students learn to instrument applications, configure dashboards, create intelligent alerts, monitor complex services, and interpret end-to-end performance data. The course combines theory with hands-on lab practice to build working proficiency with the platform in real-world environments.

Objectives

After completing this Datadog Observability (Metrics + Logs + APM + Tracing) course, you will be able to:

  • Understand and apply the concept of modern observability
  • Configure and administer Datadog in real-world environments
  • Collect, analyze, and visualize metrics, logs, and traces
  • Instrument applications for advanced monitoring
  • Create intelligent dashboards and efficient alerts
  • Use APM and Distributed Tracing to diagnose complex problems
  • Integrate Datadog with cloud services, containers, and Kubernetes
  • Implement monitoring and observability best practices
Target Audience
  • DevOps Engineers
  • SREs (Site Reliability Engineers)
  • Backend and Full Stack Developers
  • Observability and Performance Analysts
  • Systems Architects
  • IT professionals who want to master Datadog for advanced monitoring

Prerequisites
  • Basic knowledge of Linux
  • Familiarity with cloud platforms (AWS, GCP, or Azure)
  • Basic monitoring and logging concepts
  • Basic knowledge of web applications or microservices
Materials
English/Portuguese
Course Outline

1. Introduction to Datadog Observability

  1. What is Observability
  2. Core pillars: Metrics, Logs, APM, and Tracing
  3. Overview of Datadog platform and components
  4. Datadog Agent architecture
  5. Key concepts: tags, indexes, scopes, monitors

2. Datadog Metrics

  1. Understanding metric types: gauges, counters, histograms
  2. Installing and configuring the Datadog Agent
  3. Collecting system and application metrics
  4. Host maps and infrastructure overview
  5. Custom metrics: creating and sending metrics
  6. Metric tagging and enrichment
  7. Building real-time dashboards
  8. Metric-based alerting and anomaly detection

3. Datadog Logs Management

  1. Log collection pipeline architecture
  2. Enabling log collection on hosts and containers
  3. Parsing logs with pipelines, processors, and filters
  4. Log indexes and retention policies
  5. Searching, filtering, and analyzing logs
  6. Creating log-based metrics
  7. Log patterns, facets, and dashboards
  8. Error tracking and alerting from logs

4. Datadog APM (Application Performance Monitoring)

  1. Understanding APM concepts
  2. Installing APM instrumentation (Java, Python, Node.js, Go, PHP, .NET)
  3. Service map and service visualization
  4. Monitoring requests, latency, errors, and throughput
  5. Profiling applications with Datadog Profiler
  6. APM dashboards and analytics
  7. Detecting performance bottlenecks

5. Distributed Tracing

  1. Concepts of distributed tracing
  2. Trace propagation across microservices
  3. Spans, traces, resources, and services
  4. Instrumenting distributed applications
  5. Using tracing to identify errors and slowdowns
  6. End-to-end transaction troubleshooting
  7. Trace search and analytics
  8. Integrating tracing with logs and metrics

6. Integrations and Ecosystem

  1. AWS, GCP, Azure integrations
  2. Docker and Kubernetes monitoring
  3. Database monitoring (MySQL, Postgres, MongoDB, Redis, etc.)
  4. Serverless monitoring
  5. Load balancers, API gateways, web servers
  6. Network performance monitoring (NPM)

7. Dashboards and Visualization

  1. Designing effective observability dashboards
  2. Widgets, timeseries, heatmaps, query values
  3. Using template variables
  4. Building team-specific dashboards
  5. Sharing and exporting dashboards

8. Alerts, Monitors, and Incident Management

  1. Monitor types (metric, log, APM, SLO, anomaly, outlier, forecast)
  2. Creating actionable alerting strategies
  3. SLOs (Service Level Objectives) and Error Budgets
  4. Integrating Datadog with incident management tools
  5. Notifications and escalation policies

9. Security, Governance, and Best Practices

  1. RBAC and access management
  2. Data retention strategy
  3. Cost optimization in Datadog
  4. Tagging standards for observability
  5. Best practices for large-scale environments

10. Hands-on Labs and Final Project

  1. Install and configure Datadog Agent
  2. Collect metrics, logs, and traces from a sample microservices app
  3. Build a complete observability dashboard
  4. Configure monitors and SLOs
  5. Final observability project: diagnose real-world issues