Contents

Section Title Page
1 What is H2O? 5
2 Sparkling Water Introduction 6
2.1 Typical Use Cases 7
2.1.1 Model Building 7
2.1.2 Data Munging 8
2.1.3 Stream Processing 8
2.2 Features 8
2.3 Supported Data Sources 9
2.4 Supported Data Formats 10
2.5 Supported Spark Execution Environments 10
3 Design 11
3.1 Data Sharing between Spark and H2O 12
3.2 Provided Primitives 12
4 Sparkling Water Backends 16
4.1 Internal Backend 16
4.2 External Backend 16
4.2.1 Manual Mode of External Backend 17
4.2.2 Automatic Mode of External Backend 21
5 Programming API 23
5.1 Starting H2O Services 23
5.2 Memory Allocation 23
5.3 Converting H2OFrame into RDD[T] 24
5.4 Converting H2OFrame into DataFrame 24
5.5 Converting RDD[T] into H2OFrame 25
5.6 Converting DataFrame into H2OFrame 25
5.7 Creating H2OFrame from an Existing Key 26
5.8 Type Map Between H2OFrame and Spark DataFrame Types 26
5.9 Type mapping between H2O H2OFrame types and RDD[T] types 26
5.10 Calling H2O Algorithms 27
5.11 Using Spark Data Sources with H2OFrame 28
5.11.1 Reading from H2OFrame 28
5.11.2 Saving to H2OFrame 28
5.11.3 Loading and Saving Options 29
5.11.4 Specifying Saving Mode 29
6 Deployment 31
6.1 Referencing Sparkling Water 31
6.1.1 Using Fatjar 31
6.1.2 Using the Spark Package 32
6.2 Target Deployment Environments 33
6.2.1 Local cluster 33
6.2.2 On a Standalone Cluster 34
6.2.3 On a YARN Cluster 34
6.3 DataBricks Cloud 35
6.3.1 Creating a Cluster 35
6.3.2 Running Sparkling Water 36
6.3.3 Running PySparkling 37
6.4 Sparkling Water Configuration Properties 38
6.4.1 Configuration Properties not Dependent on Selected Backend 38
6.4.2 Internal Backend Configuration Properties 40
6.4.3 External Backend Configuration Properties 42
7 Building a Standalone Application 44
8 What is PySparkling Water? 46
8.1 Getting Started 46
8.2 Using Spark Data Sources 48
8.2.1 Reading from H2OFrame 48
8.2.2 Saving to H2OFrame 48
8.2.3 Loading and Saving Options 49
9 A Use Case Example 50
9.1 Predicting Arrival Delay in Minutes – Regression 50
10 FAQ 54
11 References 58

 
