Overview
This instructor-led course, BigData on Google Cloud Platform, introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform. It provides a quick overview of Google Cloud Platform and a deeper dive into its data processing capabilities.
Course Content
Introduction
In this module you will be introduced to Google Cloud Platform and the data handling aspects of the platform.
- What is the Google Cloud Platform?
- GCP Big Data Products
- Usage scenarios
- Lab: Sign up for Google Cloud Platform
Foundation of Google Cloud Platform
This module introduces the foundations of Google Cloud Platform, compute and storage, and shows how they work together to provide data ingest, storage, and federated analysis.
- CPUs on demand (Compute Engine)
- Lab: Start a Google Compute Engine instance and access it over SSH
- A global filesystem (Cloud Storage)
- Lab: Set up an Ingest-Transform-Publish data processing pipeline
- CloudShell
Data Analytics on the Cloud
This module introduces the common Big Data use cases that Google will manage for you. These are workloads that are widely run in industry today and for which we provide easy migration to the cloud.
- Stepping stones to the cloud
- CloudSQL: your SQL database on the cloud
- Lab: Import data into CloudSQL and run queries on rentals data
- Dataproc
- Lab: Machine Learning with SparkML
Scaling data analysis
This module covers the more transformational technologies in Google Cloud Platform, which may not have immediate parallels to technologies that attendees are already using (“what’s next”).
- Fast random access
- Datalab
- Demo: Sample notebook in Datalab
- BigQuery
- Lab: Build a machine learning dataset
- Machine Learning with TensorFlow
- Lab: Train and use a neural network
- Fully built models for common needs
- Lab: Translate
- Genomics API (optional)
Data processing architectures
This module introduces data processing architectures in Google Cloud Platform.
- Asynchronous processing with TaskQueues
- Message-oriented architectures with Pub/Sub
- Creating pipelines with Dataflow
Summary
- Why GCP?
- Where to go from here
- Resources