March 24th, 2016

Connecting to Spark & Sparkling Water from R & RStudio


Sparkling Water offers best-of-breed machine learning for Spark users by bringing all of H2O’s advanced algorithms and capabilities to Spark. This means that you can continue to use H2O from RStudio or any other IDE of your choice. This post walks you through the steps to connect plain R or RStudio to Sparkling Water running on Spark.
It works the same way as regular H2O: you just need to call h2o.init() from R with the right parameters, i.e. the IP address and port of the running H2O instance.
For example, we start the Sparkling shell (bin/sparkling-shell) and create an H2OContext:
[Screenshot: Spark commands in sparkling-shell]
Now the H2OContext is running and H2O’s REST API is exposed on 172.162.223:54321.
So we can open RStudio and call h2o.init() (make sure the installed H2O R package matches the H2O version that Sparkling Water is running):
[Screenshot: calling h2o.init() in RStudio]
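As a minimal sketch of this call, assuming the h2o R package is installed and the H2OContext is reachable from the machine running R, the connection might look like the following. The address 192.0.2.10 is a hypothetical placeholder, not the address from this post; substitute the IP and port that your own H2OContext reports.

```r
library(h2o)

# Hypothetical placeholder address: replace with the IP and port
# reported by your own H2OContext.
h2o_ip   <- "192.0.2.10"
h2o_port <- 54321

# startH2O = FALSE tells the client to attach to the existing cluster
# instead of trying to launch a new local H2O instance.
h2o.init(ip = h2o_ip, port = h2o_port, startH2O = FALSE)

# Print basic information about the cluster we are attached to.
h2o.clusterInfo()
```

Setting startH2O = FALSE is a design choice for this sketch: if the address is wrong, the call fails instead of silently starting an unrelated local H2O.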
Let’s now create a Spark DataFrame, publish it as an H2O frame, and access it from R.
This is how you do that in sparkling-shell:
val df = sc.parallelize(1 to 100).toDF // creates Spark DataFrame
val hf = h2oContext.asH2OFrame(df) // publishes DataFrame as H2O's Frame

[Screenshot: sparkling-shell output for the Scala code above]
You can see that the name of the published frame is frame_rdd_6. Alternatively, you can name the frame yourself during the transformation from Spark to H2O by passing the name as a second argument, replacing h2oContext.asH2OFrame(df) with:
val hf = h2oContext.asH2OFrame(df, "simple.frame")
Now let us go to RStudio and list all the available frames via the h2o.ls() function:
[Screenshot: h2o.ls() output in RStudio]
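The h2o.ls() listing step in RStudio can be sketched as follows. This assumes the H2OContext from the sparkling-shell session is still running; the address is a hypothetical placeholder that you would replace with your own.

```r
library(h2o)

# Hypothetical placeholder address: use the IP and port of your H2OContext.
h2o.init(ip = "192.0.2.10", port = 54321, startH2O = FALSE)

# h2o.ls() returns a data frame listing the keys stored in the cluster;
# frames published from Spark should appear among them.
frames <- h2o.ls()
print(frames)
```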
We can also fetch the frame or invoke an R function on it:
[Screenshot: fetching the frame in RStudio]
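A minimal sketch of fetching the frame and working with it from R, assuming the frame is still named frame_rdd_6 as in the session above and again using a hypothetical placeholder address:

```r
library(h2o)

# Hypothetical placeholder address: use the IP and port of your H2OContext.
h2o.init(ip = "192.0.2.10", port = 54321, startH2O = FALSE)

# Get an R handle to the frame published from Spark.
# The data itself stays in the H2O cluster.
hf <- h2o.getFrame("frame_rdd_6")

# Invoke ordinary R functions on the H2O frame.
dim(hf)
summary(hf)

# Pull the data into a local R data.frame. This is fine for 100 rows,
# but should be avoided for frames too large for local memory.
local_df <- as.data.frame(hf)
head(local_df)
```

If you named the frame during the transformation, pass that name (for example "simple.frame") to h2o.getFrame() instead.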
Keep hacking!
