Cloudera Training for Apache HBase (CAHB) – Outline

Detailed Course Outline

Introduction

  • About this Course
  • About Cloudera
  • Course Logistics
  • Introductions

Introduction to Hadoop

  • What Is Big Data?
  • Introducing Hadoop
  • Hadoop Components

Introduction to HBase

  • What Is HBase?
  • Why Use HBase?
  • HBase and RDBMS
  • The Give and Take of HBase

HBase Concepts

  • HBase Concepts
  • Working with HBase

The HBase Administration API

  • HBase Shell
  • Creating Tables
  • HBase Java API
  • Administration Calls

Accessing Data with the HBase API, Part 1

  • API Usage
  • Getting Data from the Shell, Java API, and Thrift API
  • Adding and Updating Data in the Shell
  • Deleting Data from the Shell, Java API, and Thrift API

Accessing Data with the HBase API, Part 2

  • Adding and Updating Data with the API
  • The Scan API
  • Advanced API
  • Working with Eclipse

HBase Architecture, Part 1

  • Cluster Components
  • How HBase Scales

HBase Architecture, Part 2

  • HBase Write Paths
  • HBase Read Paths
  • Compactions and Splits

Installation and Configuration

  • HBase Installation
  • Hardware Considerations
  • HBase Configuration
  • MapReduce and HBase Clusters
  • Replication and Disaster Recovery

Row Key Design in HBase

  • From RDBMS to HBase Schema Design
  • Application-Centric Design
  • Row Key Design

Schema Design in HBase

  • Column Families
  • Schema Design Considerations
  • Hotspotting

The HBase Ecosystem

  • OpenTSDB
  • Kiji
  • HBase and Hive

Conclusion