Overview
The Service Mesh Observability course offers a complete, hands-on look at how to implement and monitor a service mesh, with a focus on the observability, performance, and reliability of distributed applications.
Participants will learn to configure and operate solutions such as Istio, Linkerd, and Consul, applying metrics, logs, and distributed tracing to gain end-to-end visibility into microservices-based systems.
Throughout the course, hands-on labs simulate real production scenarios, incidents, and troubleshooting of applications in Kubernetes environments.
Course Content
Module 1: Introduction to Service Mesh
- Understanding the need for Service Mesh in microservices architecture
- Core concepts: sidecar proxies, control plane, and data plane
- Comparison of popular Service Mesh technologies: Istio, Linkerd, and Consul
- Key features: traffic management, security, and observability
Module 2: Observability Foundations
- What is observability: metrics, logs, and traces
- Monitoring vs observability vs telemetry
- Golden signals (latency, traffic, errors, saturation)
- Integrating observability into modern DevOps and SRE practices
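As a rough illustration of the golden signals listed in this module, the sketch below summarizes one observation window of request records into latency, traffic, errors, and saturation. The `Request` record, the `golden_signals` helper, and the sample numbers are hypothetical and exist only for this example; a mesh would normally derive these values from proxy metrics rather than from raw request lists.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    duration_ms: float  # time taken to serve the request
    status: int         # HTTP status code returned


def percentile(values: List[float], p: int) -> float:
    """Nearest-rank percentile of a non-empty list (p in 1..100)."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def golden_signals(requests: List[Request], window_s: float,
                   in_flight: int, capacity: int) -> Dict[str, float]:
    """Summarize one observation window into the four golden signals."""
    saturation = in_flight / capacity  # assumed proxy: concurrent requests vs. limit
    if not requests:
        return {"latency_p95_ms": 0.0, "traffic_rps": 0.0,
                "error_rate": 0.0, "saturation": saturation}
    errors = sum(1 for r in requests if r.status >= 500)  # count 5xx as errors
    return {
        "latency_p95_ms": percentile([r.duration_ms for r in requests], 95),
        "traffic_rps": len(requests) / window_s,
        "error_rate": errors / len(requests),
        "saturation": saturation,
    }


# Hypothetical sample: four requests seen in a 2-second window.
sample = [Request(10, 200), Request(20, 200), Request(30, 500), Request(40, 200)]
signals = golden_signals(sample, window_s=2.0, in_flight=5, capacity=10)
print(signals)
```

Treating saturation as in-flight requests over a concurrency limit is only one possible choice; CPU, memory, or queue depth may be better indicators depending on the service.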
Module 3: Setting Up the Lab Environment
- Preparing a Kubernetes cluster (Minikube, Kind, or EKS)
- Installing Prometheus and Grafana for monitoring
- Setting up Jaeger or Tempo for distributed tracing
- Configuring Fluent Bit or Loki for centralized logging
Module 4: Deploying Istio for Observability
- Installing Istio using istioctl or Helm
- Enabling telemetry and tracing in Istio
- Working with Envoy sidecars and the control plane (istiod)
- Visualizing traffic and metrics through Kiali and Grafana
Module 5: Linkerd and Lightweight Observability
- Installing and configuring Linkerd
- Observing requests and latency with Linkerd Viz
- Comparing observability capabilities between Istio and Linkerd
- Using service profiles and routes for monitoring traffic patterns
Module 6: Consul Service Mesh Overview
- Setting up Consul Connect on Kubernetes
- Observing service health and dependencies
- Integrating Consul metrics with Prometheus
- Setting up logs and tracing for a Consul-based mesh
Module 7: Distributed Tracing Deep Dive
- Tracing microservice requests end-to-end
- Configuring Jaeger, Zipkin, and OpenTelemetry agents
- Analyzing spans, dependencies, and latency bottlenecks
- Using tracing data for root cause analysis
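To make the idea of analyzing spans for latency bottlenecks more concrete, here is a minimal sketch that computes each span's self time (its duration minus the time spent in its direct children) and picks the largest. The tuple layout, the `self_times` helper, and the sample trace are hypothetical, and the calculation assumes child spans run sequentially without overlapping; real tracing backends such as Jaeger expose richer span models.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (span_id, parent_id, operation, duration_ms): a hypothetical, simplified span
Span = Tuple[str, Optional[str], str, float]


def self_times(spans: List[Span]) -> Dict[str, float]:
    """Return the time spent in each span excluding its direct children."""
    child_total: Dict[str, float] = defaultdict(float)
    for _, parent_id, _, duration in spans:
        if parent_id is not None:
            child_total[parent_id] += duration
    # Assumes children are sequential, so their durations can be subtracted.
    return {span_id: duration - child_total[span_id]
            for span_id, _, _, duration in spans}


# Hypothetical trace: a checkout request fanning out to three operations.
trace: List[Span] = [
    ("a", None, "GET /checkout", 100.0),
    ("b", "a", "auth.verify", 30.0),
    ("c", "a", "cart.load", 60.0),
    ("d", "c", "db.query", 50.0),
]

times = self_times(trace)
bottleneck = max(times, key=times.get)  # span with the largest self time
print(times, bottleneck)
```

In this sample the database query span has the largest self time, which is the kind of signal that would direct root cause analysis toward that dependency.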
Module 8: Advanced Observability Use Cases
- Service-level objectives (SLOs) and error budgets
- Alerting strategies with Prometheus and Alertmanager
- Detecting anomalies and performance degradation
- Troubleshooting with logs, traces, and metrics correlation
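The relationship between an SLO and its error budget can be sketched with a small calculation. The `error_budget` function and the sample figures below are hypothetical, and the sketch assumes a request-based availability SLO measured over a single window.

```python
from typing import Dict


def error_budget(slo_target: float, total: int, failed: int) -> Dict[str, float]:
    """Compute error-budget figures for a request-based availability SLO.

    slo_target is a fraction such as 0.999; total and failed are request
    counts observed over the SLO window.
    """
    if not 0.0 < slo_target < 1.0 or total <= 0:
        raise ValueError("slo_target must be in (0, 1) and total must be positive")
    allowed_error_rate = 1.0 - slo_target
    allowed_failures = allowed_error_rate * total
    observed_error_rate = failed / total
    return {
        "allowed_failures": allowed_failures,
        # Fraction of the budget still unspent (negative once the SLO is breached).
        "budget_remaining": 1.0 - failed / allowed_failures,
        # Above 1.0, the budget is being consumed faster than the SLO allows.
        "burn_rate": observed_error_rate / allowed_error_rate,
    }


# Hypothetical window: 1,000,000 requests, 400 failures, 99.9% target.
budget = error_budget(slo_target=0.999, total=1_000_000, failed=400)
print(budget)
```

A burn-rate figure like this is a common basis for alerting rules, since it flags fast budget consumption before the SLO itself is violated.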
Module 9: Best Practices and Optimization
- Cost-effective observability and telemetry strategies
- Securing observability data and access control
- Scaling telemetry collection and retention
- Comparing performance overhead between mesh solutions
Module 10: Hands-On Labs and Capstone Project
- Implementing end-to-end observability with Istio and Linkerd
- Monitoring service dependencies and latency maps
- Real-world incident simulation and resolution
- Capstone project: full observability pipeline in a microservices environment