Modeling Data for Inference Course

32 hours
Overview

This Modeling Data for Inference course teaches participants how to use Python to perform causal inference on observational data. Participants learn to work with inferential models, missing data, and experimental design.

Objectives

After completing this Modeling Data for Inference course, you will be able to:

  • Perform causal inference on observational data using Python
  • Run and interpret null hypothesis tests in Python
  • Implement generalized linear models with statsmodels
  • Understand missing data
  • Impute missing data
  • Produce accurate power calculations
  • Implement non-parametric methods for hypothesis testing
  • Use causal inference frameworks to identify causal effects from observational data
Prerequisites

Participants should have a solid foundation in Python programming for descriptive analysis.

Materials
English/Portuguese/Hands-on lab
Course Content

Introduction

GLMs with Python using statsmodels

  1. Applying Statistical Models for Analysis in Python: The A/B test
  2. Overview of the statsmodels library and its functions
  3. Inferential and descriptive statistics refresher
  4. Implementing A/B tests
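
As an illustration of the A/B test topic above, here is a minimal sketch of a two-sided, two-proportion z-test written with only the Python standard library. The function name and the conversion counts are hypothetical and chosen for illustration; the course itself may rely on statsmodels instead (for example `proportions_ztest` in `statsmodels.stats.proportion`).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under the null hypothesis both arms share one conversion rate.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, made up for illustration only.
z, p_value = two_proportion_ztest(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The pooled standard error assumes the null hypothesis of equal rates; an unpooled version is the usual choice when the goal is a confidence interval for the difference rather than a test.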

Modeling Continuous Data (Linear models)

  1. Formulation of the simple linear model
  2. Application of the intercept-only (null) model
  3. Binary predictor
  4. Interpreting results
  5. Categorical predictor
  6. Continuous predictor
  7. Polynomial expansions
  8. Multiple linear regression
  9. Spline models
  10. Interaction terms
  11. Picking the “best” model
  12. Discussion of confounding, interaction terms, and model building approaches

Modeling Binary Data (Logistic models)

  1. Discussion of the generalized linear model
  2. The Logit link function
  3. Binomial distribution
  4. Intercept only model
  5. Back transformation of coefficients
  6. Simple predictor
  7. Multiple predictors
  8. Odds ratio interpretations
  9. Generating a scoring data set
  10. Predicting from the model with new data

Modeling Count Outcomes

  1. How are count outcomes different?
  2. Poisson models
  3. Overdispersed modeling options
  4. Log link functions
  5. Using offsets to model rates / uneven follow-up
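
To make the offset topic concrete, here is a small sketch for the simplest case: an intercept-only Poisson model with a log link and a log(exposure) offset. In that special case the maximum-likelihood estimate has a closed form (total events divided by total follow-up), so no fitting library is needed. The function name and the data are hypothetical; models with predictors would need an iterative fit, for instance with statsmodels.

```python
from math import exp, log, sqrt


def poisson_rate_with_offset(events, exposure):
    """Intercept-only Poisson model: log(mu_i) = b0 + log(exposure_i).

    Setting the score to zero gives exp(b0) = sum(events) / sum(exposure).
    The standard error of b0 from the Fisher information is 1 / sqrt(sum(events)).
    """
    total_events = sum(events)
    total_exposure = sum(exposure)
    intercept = log(total_events / total_exposure)
    std_err = 1 / sqrt(total_events)
    return intercept, std_err


# Hypothetical data: event counts and uneven follow-up time per subject.
events = [2, 0, 5, 1]
exposure = [10.0, 4.0, 20.0, 6.0]

intercept, std_err = poisson_rate_with_offset(events, exposure)
rate = exp(intercept)  # back-transform from the log scale to events per unit time
print(f"rate per unit of follow-up = {rate:.3f}")
```

The offset is what lets subjects with different follow-up times contribute to one common rate instead of being compared on raw counts.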

Power Analyses/Study Design

  1. Understanding and estimating statistical power
  2. Type 1 and type 2 errors
  3. Using existing power estimators
  4. Simulating power through the data-generating process
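
Simulating power through the data-generating process can be sketched as follows: draw many data sets under an assumed effect, test each one, and report the share of rejections. This is a minimal standard-library sketch; the normal outcomes, the large-sample z-test, and all parameter values are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def simulate_power(effect, sd, n_per_group, alpha=0.05, n_sims=2000, seed=0):
    """Estimate power as the fraction of simulated trials that reject the null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        # Assumed data-generating process: normal outcomes, equal SD in both arms.
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        se = sqrt(stdev(control) ** 2 / n_per_group + stdev(treated) ** 2 / n_per_group)
        z = (mean(treated) - mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


power = simulate_power(effect=0.5, sd=1.0, n_per_group=64)
type1_rate = simulate_power(effect=0.0, sd=1.0, n_per_group=64)
print(f"estimated power = {power:.2f}, estimated type 1 error rate = {type1_rate:.3f}")
```

Running the same function with a zero effect gives an empirical check on the type 1 error rate, which should sit near the chosen alpha.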

Non-Parametric Analysis Methods

  1. Using bootstrapping/permutation tests
  2. Bootstrapping versus depending on asymptotic behavior to estimate confidence intervals
  3. How different/stable are my results?
  4. Resampling a data set
  5. Bias-corrected bootstrap interval
  6. Extending the bootstrap function to calculate more statistics
  7. Permutation tests for p-values
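
The resampling ideas in this section can be sketched with the standard library alone. The example below shows a plain percentile bootstrap interval (not the bias-corrected variant listed above) and a two-sided permutation test for a difference in means. The function names and the two samples are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(data, stat=mean, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample with replacement, take quantiles."""
    rng = random.Random(seed)
    n = len(data)
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break any link between group label and value
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical measurements for two groups.
group_a = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0, 5.3, 5.7]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.7, 4.4, 4.9]

ci_low, ci_high = bootstrap_ci(group_a)
p_perm = permutation_pvalue(group_a, group_b)
print(f"95% bootstrap CI for mean of A: ({ci_low:.2f}, {ci_high:.2f}); p = {p_perm:.4f}")
```

Passing a different `stat` callable (for example `statistics.median`) is one way to extend the bootstrap function to other statistics.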

Missing data

  1. Quantifying missing data
  2. Visualizing missing data
  3. MAR, MCAR, MNAR
  4. Sensitivity analysis
  5. Imputation
  6. MICE/trees pre-processing

Time to Event (Survival) Analysis

  1. Visualizing Hazards Across Time
  2. Understanding the Log Rank Test
  3. Cox Proportional Hazards Modeling
  4. Understanding and interpreting the Hazard Ratio
  5. Model diagnostics and assumptions
  6. Implementing Time Varying Covariates
  7. Parametric Survival Models
  8. Weibull Model
  9. Exponential Model
  10. Predicting Failure Times

Causal Inference: The Potential Outcomes Framework

  1. Defining treatment effects (ATT, ATE)
  2. Identifying populations of interest
  3. Defining your causal hypothesis
  4. Understanding the counterfactual
  5. Establishing the causal diagram for your problem
  6. Different methods for conditioning on variables:
       • Propensity scores
       • Direct regression adjustment
       • G-computation formulas
       • Instrumental variable analysis
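
As one illustration of the adjustment methods listed above, the sketch below applies the g-computation (standardization) formula with a single discrete confounder: the treated-versus-control difference is computed within each confounder stratum and then averaged with weights equal to the stratum shares. The function name and the records are hypothetical, and the sketch assumes no unmeasured confounding and that every stratum contains both treated and untreated units.

```python
from collections import defaultdict


def standardized_ate(records):
    """Average treatment effect by standardizing over one discrete confounder.

    records: iterable of (confounder, treatment, outcome), treatment in {0, 1}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    stratum_sizes = defaultdict(int)
    for z, t, y in records:
        sums[(z, t)] += y
        counts[(z, t)] += 1
        stratum_sizes[z] += 1
    n = sum(stratum_sizes.values())
    ate = 0.0
    for z, size in stratum_sizes.items():
        # Positivity assumption: both arms are observed in every stratum.
        mean_treated = sums[(z, 1)] / counts[(z, 1)]
        mean_control = sums[(z, 0)] / counts[(z, 0)]
        ate += (size / n) * (mean_treated - mean_control)
    return ate


# Hypothetical data: the confounder raises both treatment uptake and the outcome.
records = [
    (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0), (0, 1, 2.0),
    (1, 0, 5.0), (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0),
]

treated = [y for _, t, y in records if t == 1]
control = [y for _, t, y in records if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
adjusted = standardized_ate(records)
print(f"naive difference = {naive:.2f}, standardized ATE = {adjusted:.2f}")
```

In this constructed data set the effect is 1 within each stratum, so the gap between the naive and standardized estimates is entirely due to confounding.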