Flatten a Nested Parquet File via Sparkling Water

This article applies to Sparkling Water for h2o versions and later. After setting up Sparkling Water for your environment, follow these steps:
1. Start sparkling-shell from the Sparkling Water folder: bin/sparkling-shell
2. Import the parquet file: Java ...

Data Preparation / Wrangling / Munging / Manipulation

What is Data Preparation / Data Wrangling / Data Munging / Data Manipulation? The process of transforming raw data into another format, which is more appropriate and valuable for analytics, is called data preparation / wrangling / munging / manipulation. Data preparation includes...
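As a rough illustration of the definition above, the following minimal sketch turns a few raw records into a form better suited to analysis. The records, the field names, and the `prepare` helper are all hypothetical, invented for this example; real preparation pipelines usually involve many more steps.

```python
# Hypothetical raw records, invented for illustration only.
raw_rows = [
    {"name": " Alice ", "age": "34", "city": "Boston"},
    {"name": "Bob", "age": "", "city": "boston"},
    {"name": "Carol", "age": "29", "city": None},
]


def prepare(row):
    """Return a cleaned copy of one raw record (hypothetical helper)."""
    # Treat missing values (None or empty strings) uniformly.
    age_text = (row.get("age") or "").strip()
    city_text = (row.get("city") or "").strip()
    return {
        # Trim stray whitespace around names.
        "name": row["name"].strip(),
        # Convert numeric text to int; keep missing values as None.
        "age": int(age_text) if age_text else None,
        # Normalize capitalization so "boston" and "Boston" match.
        "city": city_text.title() if city_text else None,
    }


prepared_rows = [prepare(row) for row in raw_rows]
for row in prepared_rows:
    print(row)
```

The sketch covers only three common operations (trimming whitespace, converting types, and normalizing categories) and leaves missing values as `None` rather than guessing at them.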

Unlabeled Examples

What are Unlabeled Examples? Unlabeled examples are data for which we have only the input variables (x); the target/outcome variable (y) is not yet known at the time of analysis. For example, human-created artifacts like photos and videos are unlabeled examples as they are not created with a...

Labeled Examples

What are Labeled Examples? A machine learning model defines the relationship between features and labels. You train a model by feeding it examples, each of which is a particular instance of data. There are two types of examples: labeled and unlabeled. Labeled...
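The distinction between the two types of examples can be sketched in a few lines. This is one possible representation, not a standard one: the `Example` type, the `is_labeled` helper, and the feature values and labels are assumptions made up for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Example(NamedTuple):
    """One instance of data (hypothetical type for this sketch)."""
    features: Sequence[float]    # input variables (x)
    label: Optional[str] = None  # target variable (y); None when unknown


def is_labeled(example):
    """An example counts as labeled when its label is known."""
    return example.label is not None


# Invented data: two labeled examples and one unlabeled example.
examples = [
    Example(features=[5.1, 3.5], label="setosa"),
    Example(features=[6.2, 2.9]),
    Example(features=[5.9, 3.0], label="virginica"),
]

labeled = [e for e in examples if is_labeled(e)]
unlabeled = [e for e in examples if not is_labeled(e)]
print(len(labeled), "labeled,", len(unlabeled), "unlabeled")
```

Under this representation, only the labeled examples carry the feature-to-label pairs a model can learn from during supervised training; the unlabeled ones supply features alone.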