Overview
Transformers and Foundation Models course. This course offers a structured, in-depth look at the Transformer architecture and the concept of foundation models, which have reshaped modern deep learning and artificial intelligence. It spans the mathematical and computational foundations of attention, large-scale architectures, pretraining strategies, fine-tuning, and applications in natural language processing, computer vision, and multimodal systems. The emphasis is on architectural understanding, scalability, and practical use in research and production environments.
Course Syllabus
Module 1: Introduction to Transformers
- Limitations of RNNs and CNNs for sequence modeling
- Evolution towards attention-based models
- Transformer architecture overview
- Encoder, decoder and encoder-decoder models
Module 2: Attention Mechanisms
- Attention intuition and mathematical formulation
- Query, key and value representations
- Scaled dot-product attention
- Masked attention and causal attention
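The topics above can be illustrated with a minimal NumPy sketch of scaled dot-product attention, including an optional causal mask; the function name and the `-1e9` masking constant are illustrative choices, not from the course materials.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, causal=False):
    """Compute softmax(Q K^T / sqrt(d_k)) V, optionally with a causal mask."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_q, n_k) similarity scores
    if causal:
        # Mask future positions so token i attends only to tokens j <= i.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

With `causal=True`, each row of the attention matrix is a distribution over only the current and earlier positions, which is what autoregressive decoding requires.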
Module 3: Multi-Head Attention and Positional Encoding
- Multi-head attention architecture
- Linear projections and dimensionality
- Positional encoding strategies
- Relative and rotary positional embeddings
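As a concrete reference point for the positional-encoding topics above, here is a sketch of the classic sinusoidal encoding (assuming an even `d_model`); relative and rotary embeddings replace this absolute scheme but follow the same interface.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return the (seq_len, d_model) sinusoidal position matrix (d_model even)."""
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(d_model // 2)[None, :]          # (1, d_model/2)
    angles = positions / (10000 ** (2 * dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions get sine
    pe[:, 1::2] = np.cos(angles)  # odd dimensions get cosine
    return pe
```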
Module 4: Transformer Blocks and Training
- Feed-forward networks inside transformers
- Residual connections and layer normalization
- Backpropagation through attention
- Computational complexity and scaling laws
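The residual and normalization topics above can be sketched as a single pre-norm feed-forward sublayer in NumPy; the pre-norm ordering and ReLU activation are one common design choice, shown here for illustration.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def feed_forward(x, W1, b1, W2, b2):
    """Position-wise FFN: expand to d_ff, apply ReLU, project back to d_model."""
    return np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

def ffn_sublayer(x, W1, b1, W2, b2):
    """Pre-norm residual sublayer: x + FFN(LayerNorm(x))."""
    return x + feed_forward(layer_norm(x), W1, b1, W2, b2)
```

Because the sublayer is residual, zero-initialized FFN weights leave the input unchanged, which is one reason deep stacks of such blocks remain trainable.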
Module 5: Pretraining Strategies
- Language modeling objectives
- Masked language models
- Autoregressive models
- Contrastive and self-supervised learning
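To make the masked-language-modeling objective above concrete, here is a toy masking routine; the `[MASK]` token and 15% default rate mirror common practice, but the function itself is a hypothetical illustration (it omits the random-replacement and keep-original variants used in practice).

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", p=0.15, seed=0):
    """Replace each token with mask_token with probability p.

    Returns the corrupted sequence and a parallel target list holding the
    original token at masked positions and None elsewhere (no loss there).
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < p:
            masked.append(mask_token)
            targets.append(tok)   # model must predict the original token
        else:
            masked.append(tok)
            targets.append(None)  # unmasked positions contribute no loss
    return masked, targets
```

An autoregressive objective, by contrast, needs no corruption: every position is trained to predict the next token under a causal mask.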
Module 6: Foundation Models
- Definition and characteristics of foundation models
- Scaling data, parameters and compute
- Emergent capabilities
- Ethical and societal considerations
Module 7: Fine-Tuning and Adaptation
- Full fine-tuning
- Parameter-efficient fine-tuning (PEFT)
- Prompt tuning and instruction tuning
- Reinforcement learning from human feedback
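Among the PEFT methods above, low-rank adaptation (LoRA) is easy to sketch: the pretrained weight stays frozen while a small low-rank update is learned. The function below is a minimal NumPy illustration under assumed shapes, not the reference implementation.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass with a frozen weight W plus a low-rank LoRA update.

    x: (batch, d_in); W: (d_out, d_in) frozen pretrained weight;
    A: (r, d_in), B: (d_out, r) trainable, with rank r << min(d_in, d_out).
    Effective weight is W + alpha * (B @ A), computed without materializing it.
    """
    return x @ W.T + alpha * (x @ A.T) @ B.T
```

Initializing `B` to zeros makes the adapted model start out identical to the pretrained one, so fine-tuning begins from the original behavior while training only the small `A` and `B` matrices.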
Module 8: Multimodal and Advanced Applications
- Vision transformers
- Text-to-image and multimodal models
- Retrieval-augmented generation
- Deployment considerations for large models
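The first step of a vision transformer, turning an image into a sequence of patch tokens, can be sketched as below; patch size and layout are illustrative assumptions (image dimensions are assumed divisible by the patch size).

```python
import numpy as np

def image_to_patches(img, patch_size):
    """Split an (H, W, C) image into non-overlapping flattened patches.

    Returns an (num_patches, patch_size * patch_size * C) array; each row is
    one patch, ready to be linearly projected into a ViT token embedding.
    """
    H, W, C = img.shape
    p = patch_size
    patches = img.reshape(H // p, p, W // p, p, C)
    # Bring the two grid axes to the front, then flatten each patch.
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * C)
    return patches
```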