Python for Data Analysis with Pandas and NumPy Course


24 hours
Overview

The Python for Data Analysis with Pandas and NumPy course provides a solid foundation for professionals who want to analyze and manipulate data with Python. Using the NumPy and Pandas libraries, participants learn to perform efficient operations on large volumes of data, including cleaning, transformation, aggregation, and exploratory analysis, and to prepare data for reports and machine learning models.

Objective

After completing this Python for Data Analysis with Pandas and NumPy course, you will be able to:

  • Manipulate and analyze data efficiently using Pandas and NumPy
  • Work with data structures such as Series and DataFrames
  • Perform complex data transformations and aggregations
  • Handle missing, duplicate, and inconsistent data
  • Prepare data for statistical analysis and machine learning
Target Audience
  • Beginner data scientists, analysts, developers, and students who want to master Python data analysis tools in a practical, applied way.
Prerequisites
  • Basic knowledge of Python
  • Basic understanding of statistics and linear algebra (desirable)
  • Familiarity with Jupyter Notebook
Materials
English/Portuguese + Exercises + Hands-on Lab
Course Content

Introduction to Data Analysis with Python

  1. Importance of data analysis in business and science
  2. Overview of Pandas and NumPy
  3. Setting up the environment (Anaconda, Jupyter Notebook)

NumPy Fundamentals

  1. Creating and manipulating NumPy arrays
  2. Array indexing, slicing, and reshaping
  3. Vectorized operations and broadcasting
  4. Mathematical and statistical functions in NumPy
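
The topics in this module can be illustrated with a minimal sketch. It is not taken from the course materials; the array values and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) x 4 features (columns).
data = np.arange(12, dtype=float).reshape(3, 4)

# Slicing: first two rows, last two columns.
block = data[:2, -2:]

# Broadcasting: the length-4 vector of column means is
# subtracted from every row of the (3, 4) array.
col_means = data.mean(axis=0)
centered = data - col_means

# Vectorized statistic: one sum per row, no explicit Python loop.
row_sums = data.sum(axis=1)
```

After centering, every column of `centered` has a mean of zero, which is a quick way to check that the broadcast ran along the intended axis.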

Working with Random Data and Simulations

  1. Random number generation and sampling
  2. Simulating data for analysis
  3. Using NumPy for numerical computations
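
A small simulation along these lines might look like the sketch below. It assumes NumPy's `default_rng` generator; the distribution parameters and names are illustrative only.

```python
import numpy as np

# A seeded generator makes the simulation reproducible.
rng = np.random.default_rng(seed=42)

# Simulate 1,000 hypothetical measurements from a normal distribution.
heights = rng.normal(loc=170.0, scale=10.0, size=1_000)

# Draw 100 of them without replacement.
sample = rng.choice(heights, size=100, replace=False)

# Compare the sample mean with the mean of the full simulated set.
print(heights.mean(), sample.mean())
```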

Introduction to Pandas

  1. Understanding Series and DataFrames
  2. Creating DataFrames from different sources (CSV, Excel, JSON, SQL)
  3. Indexing, selecting, and filtering data
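
As a rough illustration of these topics, the sketch below builds a Series and a DataFrame in memory and selects from them. The column names and values are invented for the example.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A DataFrame is a table of labeled columns (made-up values).
df = pd.DataFrame({
    "item": ["pen", "book", "lamp"],
    "price": [1.5, 12.0, 30.0],
})

by_label = s.loc["b"]               # label-based selection
first_row = df.iloc[0]              # position-based selection
expensive = df[df["price"] > 10.0]  # boolean filtering
```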

Data Cleaning and Preparation

  1. Handling missing and duplicate data
  2. Data type conversion and renaming columns
  3. String operations and applying custom functions
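
One possible cleaning pipeline covering these steps is sketched below. The `raw` table is hypothetical, and filling missing scores with the column mean is just one assumed strategy among several.

```python
import numpy as np
import pandas as pd

# Hypothetical messy input: stray whitespace, numbers stored as
# strings, a duplicated row, and missing values.
raw = pd.DataFrame({
    "Name": [" ana ", "Bruno", "Bruno", None],
    "Age": ["31", "45", "45", "28"],
    "Score": [8.5, np.nan, np.nan, 7.0],
})

clean = (
    raw.drop_duplicates()            # remove the repeated "Bruno" row
       .dropna(subset=["Name"])      # drop rows with no name
       .rename(columns={"Name": "name", "Age": "age", "Score": "score"})
       .assign(
           name=lambda d: d["name"].str.strip().str.title(),
           age=lambda d: d["age"].astype(int),
           score=lambda d: d["score"].fillna(d["score"].mean()),
       )
       .reset_index(drop=True)
)
```

Chaining the steps keeps the original `raw` frame untouched, which makes it easier to compare before and after.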

Data Transformation and Aggregation

  1. Sorting, grouping, and aggregating data
  2. Pivot tables and cross-tabulations
  3. Merging and joining multiple DataFrames
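
The sketch below shows one way grouping, pivoting, and merging fit together. The `sales` and `stores` tables are invented for illustration.

```python
import pandas as pd

# Hypothetical sales records and a store lookup table.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "product": ["x", "y", "x", "y"],
    "units": [10, 5, 7, 3],
})
stores = pd.DataFrame({"store": ["A", "B"], "city": ["Lisbon", "Porto"]})

# Group and aggregate: total units per store.
totals = sales.groupby("store", as_index=False)["units"].sum()

# Pivot table: stores as rows, products as columns.
pivot = sales.pivot_table(
    index="store", columns="product", values="units", aggfunc="sum"
)

# Left join: attach the city to each store total.
merged = totals.merge(stores, on="store", how="left")
```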

Exploratory Data Analysis (EDA)

  1. Descriptive statistics and correlations
  2. Outlier detection and data visualization with Pandas
  3. Identifying data trends and anomalies
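
A minimal EDA sketch follows. It assumes the common 1.5 × IQR rule for flagging outliers, which is one convention among several; the data is made up.

```python
import pandas as pd

# Hypothetical data with one obvious outlier in column "x".
df = pd.DataFrame({"x": [1, 2, 3, 4, 100], "y": [2, 4, 6, 8, 10]})

summary = df.describe()  # count, mean, std, quartiles per column
corr = df.corr()         # pairwise Pearson correlations

# Flag values beyond 1.5 * IQR from the quartiles.
q1 = df["x"].quantile(0.25)
q3 = df["x"].quantile(0.75)
iqr = q3 - q1
mask = (df["x"] < q1 - 1.5 * iqr) | (df["x"] > q3 + 1.5 * iqr)
outliers = df[mask]
```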

Working with Time Series Data

  1. Parsing and manipulating dates
  2. Resampling and frequency conversion
  3. Rolling statistics and time-based indexing
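
These operations can be sketched on a small daily series as follows. The dates and values are illustrative only.

```python
import pandas as pd

# Parse a date string into a Timestamp.
parsed = pd.to_datetime("2024-03-15")

# Hypothetical daily series over six days.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# Resample to 2-day bins and sum each bin.
two_day = ts.resample("2D").sum()

# 3-day rolling mean (the first two values are NaN).
rolling = ts.rolling(window=3).mean()

# Time-based indexing: label slices include both endpoints.
first_days = ts["2024-01-01":"2024-01-03"]
```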

Input/Output Operations

  1. Reading and writing CSV, Excel, and JSON files and SQL databases
  2. Working with APIs and external data sources
  3. Managing large datasets efficiently

Performance Optimization

  1. Vectorization and avoiding loops
  2. Memory usage optimization
  3. Profiling and benchmarking code
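
To make the first two topics concrete, the sketch below contrasts a Python-level loop with a vectorized column operation and reduces memory use by converting a repetitive string column to a categorical. The data is synthetic, and actual savings depend on the dataset and pandas version.

```python
import numpy as np
import pandas as pd

# Synthetic table with 1,000 rows.
df = pd.DataFrame({
    "price": np.linspace(1.0, 100.0, 1_000),
    "qty": np.arange(1_000) % 10,
    "region": ["north", "south"] * 500,
})

# Loop version: one Python-level multiplication per row.
loop_total = [p * q for p, q in zip(df["price"], df["qty"])]

# Vectorized version: a single operation over whole columns.
df["total"] = df["price"] * df["qty"]

# Memory: a column with few distinct strings can be stored as a categorical.
before = df["region"].memory_usage(deep=True)
df["region"] = df["region"].astype("category")
after = df["region"].memory_usage(deep=True)
print(before, after)
```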

Integrating Pandas, NumPy, and Visualization Libraries

  1. Using Matplotlib and Seaborn for visual analytics
  2. Combining numeric and visual analysis
  3. Case study: From raw data to insights

Final Project

  1. Full end-to-end data analysis project using Pandas and NumPy
  2. Data cleaning, transformation, visualization, and reporting