Overview
This course covers the deployment, operation, and management of open-source Large Language Models (LLMs) in enterprise environments. Participants learn to select suitable models, prepare infrastructure, optimize performance, implement scalable inference architectures, and operate generative AI solutions built on open-source models. The course explores technologies such as Llama, Mistral, Qwen, Gemma, DeepSeek, vLLM, Ollama, Hugging Face, and Kubernetes, along with modern inference platforms, with a focus on production environments.
Course Content
Module 1: Introduction to Open Source LLMs
- Evolution of open-source AI
- Open-source versus proprietary models
- Enterprise adoption drivers
- Open-source AI ecosystem overview
- Licensing considerations
- Model selection strategies
Module 2: Overview of Modern Open Source LLMs
- Llama family models
- Mistral and Mixtral models
- Qwen model ecosystem
- Gemma models
- DeepSeek models
- Emerging open-source models
Module 3: Infrastructure Fundamentals for LLM Deployment
- Compute requirements
- GPU architectures and selection
- CPU-based inference considerations
- Memory planning strategies
- Storage requirements
- Networking fundamentals
Module 4: Model Acquisition and Management
- Hugging Face ecosystem
- Model repositories
- Model versioning
- Artifact management
- Secure model distribution
- Enterprise model governance
Module 5: Local and Single-Node Deployments
- Ollama deployment architecture
- LM Studio environments
- Local inference workflows
- Quantized model execution
- Performance tuning
- Resource optimization
Module 6: Production Inference Platforms
- vLLM architecture
- Text Generation Inference (TGI)
- SGLang fundamentals
- High-performance serving frameworks
- Throughput optimization
- Latency management
Module 7: Containerization and Kubernetes Deployment
- Containerizing LLM workloads
- Docker best practices
- Kubernetes architecture
- GPU scheduling
- Scaling strategies
- High-availability deployments
Module 8: Performance Optimization and Quantization
- Model quantization strategies
- GPTQ and AWQ implementations
- Memory optimization
- Throughput tuning
- Cost-performance trade-offs
- Hardware acceleration techniques
Module 9: Security and Governance
- Secure model deployment
- Identity and access management
- API security controls
- Data privacy considerations
- AI governance requirements
- Compliance and auditability
Module 10: Monitoring, Observability and LLMOps
- LLM observability fundamentals
- Metrics collection
- Performance monitoring
- Log management
- Capacity planning
- Operational excellence practices
Module 11: Enterprise Integration Architectures
- API gateway integration
- RAG integration patterns
- Agent architectures
- Multi-model routing
- Hybrid AI environments
- Enterprise architecture patterns
Module 12: Open Source LLM Deployment Workshop
- Ollama deployment lab
- vLLM production deployment exercises
- Kubernetes deployment projects
- Quantization and optimization activities
- Monitoring and governance implementation
- Final enterprise open-source LLM deployment project