Learn Analytics Materials

Microsoft learning materials for data science.

Perform Data Engineering on Microsoft Azure HDInsight Community Guide

Administer and Provision HDInsight Clusters

  1. Deploy HDInsight Clusters

  2. Deploy Secure Multi-User HDInsight Clusters

  3. Ingest data for batch and interactive processing

  4. Configure HDInsight Clusters

  5. Manage and Debug HDInsight Jobs

Implement Big Data Batch Processing Solutions

  1. Implement batch solutions with Hive and Apache Pig

  2. Design Batch ETL Solutions for Big Data with Spark

  3. Operationalize Hadoop and Spark

Implement Big Data Interactive Processing Solutions

  1. Implement interactive queries for big data with Spark SQL

  2. Perform exploratory data analysis by using Spark SQL

  3. Implement interactive queries for big data with Interactive Hive

  4. Perform interactive processing by using Apache Phoenix on HBase

Implement Big Data Real-Time Processing Solutions

  1. Create Spark streaming applications using DStream API

  2. Create Spark structured streaming applications

  3. Develop big data real-time processing solutions with Apache Storm

  4. Build solutions that use Kafka

  5. Build solutions that use HBase