Cloudera Training for Apache HBase
| Course Details | |
| --- | --- |
| Code | CLD-HBASE |
| Tuition (CAD) | |
| Tuition (USD) | |
Take your knowledge to the next level with Cloudera Training for Apache HBase. Cloudera Educational Services’ three-day training course enables participants to store and access massive quantities of multi-structured data and perform hundreds of thousands of operations per second.
Who Can Benefit
- This course is appropriate for developers and administrators who intend to use HBase.
Skills Gained
- During this course, you will learn how to:
- Determine whether HBase is appropriate for a given use case
- Employ best practices for schema and row key design
- Create, populate, alter, and remove HBase tables
- Add, locate, retrieve, update, and delete data
- Perform common cluster administration tasks
- Identify and resolve performance bottlenecks
Prerequisites
- Prior experience with databases and data modeling is helpful, but not required. Knowledge of Java is assumed. Prior knowledge of Hadoop is not required, but Cloudera Developer Training for Spark and Hadoop provides an excellent foundation for this course.
Course Content
Introduction to Hadoop and HBase
- Introducing Hadoop
- Core Hadoop Components
- Exercise: Using HDFS
- What is HBase?
- Strengths of HBase
- HBase in Production
- Weaknesses of HBase
HBase Tables
- HBase Concepts
- HBase Table Fundamentals
- Thinking About Table Design
- Exercise: HBase Data Import
HBase Shell
- Creating Tables with the HBase Shell
- Working with Tables
- Exercise: Using the HBase Shell
- Working with Table Data
- Exercise: Data Access in the HBase Shell
HBase Architecture Fundamentals
- HBase Regions
- HBase Cluster Architecture
- HBase and HDFS Data Locality
HBase Schema Design
- General Design Considerations
- Application-Centric Design
- Designing HBase Row Keys
- Other HBase Table Features
- Exercise: Using MIN_VERSIONS and Time-To-Live
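For orientation, the row key techniques this module covers (salting, fixed-width fields, reversed timestamps) might look roughly like the sketch below. This is an illustrative example only, not course material; the key layout, field names, and bucket count are assumptions.

```java
import java.nio.charset.StandardCharsets;

public class RowKeySketch {
    // Build a composite row key of the form <salt>-<userId>-<reversedTimestamp>.
    // The salt prefix spreads sequential writes across regions, and the
    // reversed timestamp makes the newest event for a user sort first.
    static byte[] eventRowKey(String userId, long eventTimeMillis, int saltBuckets) {
        int salt = Math.floorMod(userId.hashCode(), saltBuckets);  // small, fixed-width prefix
        long reversedTs = Long.MAX_VALUE - eventTimeMillis;        // newest-first ordering
        String key = String.format("%02d-%s-%019d", salt, userId, reversedTs);
        return key.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] key = eventRowKey("user42", System.currentTimeMillis(), 16);
        System.out.println(new String(key, StandardCharsets.UTF_8));
    }
}
```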
Basic Data Access with the HBase API
- Options to Access HBase Data
- Creating and Deleting HBase Tables
- Retrieving Data with Get
- Retrieving Data with Scan
- Inserting and Updating Data
- Deleting Data
- Exercise: Using the Developer API
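To give a feel for the developer API this module exercises, here is a minimal put/get/scan/delete sketch, assuming the HBase 2.x Java client. The table name `users`, column family `info`, and row keys are illustrative assumptions, not taken from the course exercises.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class BasicAccessSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("users"))) {  // assumed table

            // Insert or update a cell with Put
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            // Retrieve a single row with Get
            Result row = table.get(new Get(Bytes.toBytes("row1")));
            System.out.println(
                Bytes.toString(row.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));

            // Retrieve multiple rows with Scan
            try (ResultScanner scanner = table.getScanner(new Scan())) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }

            // Delete the row
            table.delete(new Delete(Bytes.toBytes("row1")));
        }
    }
}
```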
More Advanced HBase API Features
- Filtering Scans
- Exercise: HBase Filters
- Client-Side Write Buffer
- Exercise: Using Client-Side Write Buffer
- Best Practices
- HBase Coprocessors
- Exercise: Using Atomic Counters
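As a rough sketch of the features named above (filtered scans, the client-side write buffer, and atomic counters), again assuming the HBase 2.x Java client; the table, column family, and qualifier names are hypothetical.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class AdvancedApiSketch {
    static void examples(Connection conn) throws Exception {
        TableName users = TableName.valueOf("users");  // assumed table name

        // Filtered scan: only return rows whose key starts with "user"
        Scan scan = new Scan();
        scan.setFilter(new PrefixFilter(Bytes.toBytes("user")));
        try (Table table = conn.getTable(users);
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result r : scanner) {
                System.out.println(Bytes.toString(r.getRow()));
            }
        }

        // Client-side write buffer: batch many Puts before sending them to the RegionServers
        try (BufferedMutator mutator = conn.getBufferedMutator(users)) {
            for (int i = 0; i < 1000; i++) {
                Put put = new Put(Bytes.toBytes("user" + i));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("name-" + i));
                mutator.mutate(put);  // buffered locally
            }
            mutator.flush();          // sent to the cluster here (and again on close)
        }

        // Atomic counter: increment server-side without a read-modify-write cycle
        try (Table table = conn.getTable(users)) {
            long visits = table.incrementColumnValue(
                Bytes.toBytes("user1"), Bytes.toBytes("info"), Bytes.toBytes("visits"), 1L);
            System.out.println("visits = " + visits);
        }
    }

    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create())) {
            examples(conn);
        }
    }
}
```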
HBase Write Path
- HBase Write Path
- Exercise: Exploring HBase
- Compaction
- Splits
- Exercise: Flushes and Compactions
HBase Read Path
- How HBase Reads Data
- Block Caches for Reading
HBase Performance Tuning
- Column Family Considerations
- Schema Design Considerations
- Configuring for Caching
- Memory Considerations
- Dealing with Time Series and Sequential Data
- Pre-Splitting Regions
- Exercise: Detecting Hot Spots
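One of the tuning topics above, pre-splitting regions, can be illustrated with a short Admin API sketch (HBase 2.x client assumed; the table name, column family, and split range are made up for the example).

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {

            TableDescriptor desc = TableDescriptorBuilder
                .newBuilder(TableName.valueOf("events"))                 // assumed table name
                .setColumnFamily(ColumnFamilyDescriptorBuilder.of("d"))  // assumed column family
                .build();

            // Create the table with 16 regions spread evenly between "00" and "99",
            // so writes carrying a two-digit salt prefix land on many RegionServers
            // from the start instead of hot-spotting a single region.
            admin.createTable(desc, Bytes.toBytes("00"), Bytes.toBytes("99"), 16);
        }
    }
}
```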
HBase Administration and Cluster Management
HBase Replication and Backup
- HBase Replication
- HBase Backup
- MapReduce and HBase Clusters
- Exercise: Administration
Using Hive and Impala with HBase
- How to Use Hive and Impala to Access HBase
- Exercise: Hive and HBase
Appendix A: Accessing Data with Python and Thrift
- Thrift Usage
- Working with Tables
- Getting and Putting Data
- Scanning Data
- Deleting Data
- Counters
- Filters
- Optional Exercise: Using Python and Thrift with HBase
Appendix B: OpenTSDB
Appendix C: hbase-spark API
- Introduction
- Architecture and Integration Patterns
- Typing and API Usage
- Future Work
- Optional Exercise: Using the hbase-spark API