Overview
This course provides an in-depth, hands-on treatment of Transformers in the context of Data Science applied to NLP (Natural Language Processing). Participants will progress from the fundamentals of the Transformer architecture to implementing real-world solutions with pretrained models such as BERT and GPT, using modern tooling such as Hugging Face Transformers, PyTorch, and TensorFlow.
The course is geared toward practical application in data science projects, with a focus on building, optimizing, and deploying language models.
Course Outline
Module 1: Foundations of NLP and Data Science
- Overview of NLP in Data Science
- Text data lifecycle
- Data collection and preprocessing
- Challenges in NLP
Module 2: Evolution to Transformers
- Limitations of traditional NLP models
- RNN, LSTM, and Seq2Seq overview
- Introduction to attention mechanisms
- Emergence of Transformers
Module 3: Attention Mechanism Deep Dive
- Scaled dot-product attention
- Multi-head attention
- Positional encoding
- Attention visualization
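The core operation of this module can be sketched in plain Python. This is a didactic, unbatched version for small lists of vectors; real implementations use tensor libraries such as PyTorch and operate on batched matrices:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    with Q, K, V given as lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query against every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Multi-head attention repeats this computation in parallel over several learned projections of Q, K, and V, then concatenates the results.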
Module 4: Transformer Architecture
- Encoder and decoder structure
- Residual connections and normalization
- Training dynamics
- Model scalability
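The residual-plus-normalization wiring used throughout the encoder and decoder can be illustrated with a minimal sketch (post-norm variant; the learnable gain and bias of layer normalization are omitted for brevity):

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (gain/bias omitted)."""
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [(xi - mean) / math.sqrt(var + eps) for xi in x]

def residual_block(x, sublayer):
    """Post-norm Transformer wiring: LayerNorm(x + Sublayer(x)).
    `sublayer` stands in for attention or the feed-forward network."""
    y = sublayer(x)
    return layer_norm([xi + yi for xi, yi in zip(x, y)])
```

The residual path lets gradients flow directly through deep stacks, which is one reason Transformers scale to many layers.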
Module 5: Pretrained Models Ecosystem
- BERT and its variants
- GPT models
- RoBERTa, DistilBERT, T5
- Model selection strategies
Module 6: Hugging Face in Practice
- Using Transformers library
- Tokenizers and datasets
- Pipelines API
- Model hub exploration
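As a rough illustration of what a tokenizer does, here is a hypothetical word-level `ToyTokenizer`; real Hugging Face tokenizers use subword algorithms such as BPE or WordPiece, handle special tokens like `[CLS]` and `[SEP]`, and are loaded via `AutoTokenizer`:

```python
class ToyTokenizer:
    """Word-level stand-in for a real tokenizer: maps text to integer
    ids and back. Id 0 is reserved for unknown words."""

    def __init__(self, corpus):
        vocab = sorted({w for text in corpus for w in text.lower().split()})
        self.unk_id = 0
        self.id_of = {w: i + 1 for i, w in enumerate(vocab)}
        self.word_of = {i: w for w, i in self.id_of.items()}

    def encode(self, text):
        return [self.id_of.get(w, self.unk_id) for w in text.lower().split()]

    def decode(self, ids):
        return " ".join(self.word_of.get(i, "[UNK]") for i in ids)
```

Subword vocabularies avoid the unknown-token problem this toy version has: a rare word is split into known pieces rather than mapped to `[UNK]`.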
Module 7: Fine-Tuning Techniques
- Transfer learning concepts
- Fine-tuning for classification
- Fine-tuning for NER
- Fine-tuning for QA
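The transfer-learning idea behind this module, freezing the pretrained encoder and training only a small task head, can be sketched as follows. The `frozen_encoder` feature function here is a hypothetical stand-in for a real frozen model (e.g. BERT's pooled embedding):

```python
import math

def frozen_encoder(text):
    """Stand-in for a frozen pretrained encoder: a fixed feature vector.
    In practice this would be an embedding from a pretrained Transformer."""
    words = text.lower().split()
    return [len(words),
            sum(w in {"good", "great"} for w in words),
            sum(w in {"bad", "awful"} for w in words)]

def train_head(data, epochs=200, lr=0.1):
    """Fine-tune only a logistic-regression head on top of frozen features."""
    dim = len(frozen_encoder(data[0][0]))
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for text, label in data:
            x = frozen_encoder(text)
            p = 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            g = p - label  # gradient of log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, text):
    x = frozen_encoder(text)
    return 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
```

Full fine-tuning instead updates all encoder weights at a small learning rate; the head-only variant shown here is cheaper and often a reasonable baseline.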
Module 8: NLP Applications with Transformers
- Sentiment analysis
- Named Entity Recognition
- Question answering systems
- Text generation
Module 9: Optimization and Performance
- Model compression
- Quantization and pruning
- Efficient inference
- Cost optimization
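Quantization, one of the compression techniques covered here, can be illustrated with a minimal symmetric int8 sketch: each float weight is approximated as `scale * q` with `q` an integer in [-127, 127]:

```python
def quantize_int8(weights):
    """Symmetric linear quantization: w ~= scale * q with q in [-127, 127].
    Falls back to scale 1.0 when all weights are zero."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [qi * scale for qi in q]
```

The round-trip error is bounded by half the scale per weight; real schemes add per-channel scales, zero points for asymmetric ranges, and calibration data.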
Module 10: Deployment and MLOps
- Model serving via APIs
- Containerization with Docker
- CI/CD pipelines for ML
- Monitoring and logging
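Serving a model via an API can be sketched with only the standard library. The `predict` function below is a toy stand-in for a real model call; production setups typically use frameworks such as FastAPI or TorchServe behind a container:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(text):
    """Toy stand-in for the real model call (e.g. a loaded Transformer)."""
    return {"label": "POSITIVE" if "good" in text.lower() else "NEGATIVE"}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, run the model, and return a JSON response
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["text"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), PredictHandler).serve_forever()
```

Keeping the model call behind a plain function like `predict` makes it easy to swap the toy logic for a loaded model and to unit-test the endpoint logic without a running server.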
Module 11: Advanced Topics
- Retrieval-Augmented Generation (RAG)
- Multi-modal transformers
- RLHF (Reinforcement Learning with Human Feedback)
- Custom architectures
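The retrieval half of RAG can be sketched with a toy bag-of-words retriever; real systems use dense embeddings from a trained encoder and an approximate-nearest-neighbor vector index:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; real RAG uses dense encoder vectors."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    """Augment the generator's prompt with the retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

The augmented prompt is then passed to a generative model, grounding its answer in the retrieved documents rather than in parameters alone.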
Module 12: Final Project
- End-to-end NLP solution
- Business problem definition
- Model development and evaluation
- Deployment and presentation