July 13th, 2016

H2O + TensorFlow on AWS GPU

Category: Deep Learning, Tutorials

TensorFlow on AWS GPU instance
In this tutorial, we show how to set up TensorFlow on an AWS GPU instance and run the H2O TensorFlow Deep Learning demo.
Pre-requisites:
To get started, request an AWS EC2 instance with GPU support. We used a single g2.2xlarge instance running Ubuntu 14.04. To set up TensorFlow with GPU support, the following software should be installed:

  1. Java 1.8
  2. Python pip
  3. Unzip utility
  4. CUDA Toolkit (>= v7.0)
  5. cuDNN (v4.0)
  6. Bazel (>= v0.2)
  7. TensorFlow (v0.9)

To run the H2O TensorFlow Deep Learning demo, the following software should be installed:

  1. IPython notebook
  2. Scala
  3. Spark
  4. Sparkling Water
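Before installing anything, it can be useful to check which of these tools are already on the machine. A minimal sketch (the command names are the common defaults; `require_cmd` is a helper invented here):

```shell
#!/bin/sh
# Print whether each prerequisite command is available on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

for cmd in java pip unzip nvcc bazel scala spark-shell; do
    require_cmd "$cmd" || true
done
```

Anything reported as missing is covered by one of the installation sections below.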

Software Installation:
Java:


#To install Java, run the steps below. Type 'Y' at the installation prompt:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
#Update JAVA_HOME in ~/.bashrc (the Oracle installer places Java under /usr/lib/jvm/java-8-oracle):
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
#Add JAVA_HOME to PATH:
export PATH=$PATH:$JAVA_HOME/bin
# Execute following command to update current session:
source ~/.bashrc
#Verify version and path:
java -version
echo $JAVA_HOME
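Several steps in this guide say to "update" a variable in ~/.bashrc. One way to script that without duplicating lines is a small append-if-absent helper (a sketch; `append_once` is a name invented here, and the JAVA_HOME value shown is the one the Oracle installer uses):

```shell
#!/bin/sh
# Append a line to a file only if an identical line is not already there.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    grep -qxF "$line" "$file" || echo "$line" >> "$file"
}

# Example usage:
# append_once 'export JAVA_HOME=/usr/lib/jvm/java-8-oracle' ~/.bashrc
# source ~/.bashrc
```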

Python:


#AWS EC2 instances have Python installed by default. Verify that Python 2.7 is installed:
python -V
#Install pip
sudo apt-get install python-pip
#Install IPython notebook
sudo pip install "ipython[notebook]"
#To run the H2O example notebooks, install the following packages:
sudo pip install requests
sudo pip install tabulate

Unzip utility:


#Execute the following command to install unzip:
sudo apt-get install unzip

Scala:


#Follow the steps below. Type 'Y' at the installation prompt:
sudo apt-get install scala
#Update SCALA_HOME in ~/.bashrc (apt installs the Scala jars under /usr/share/java) and execute the following command to update the current session:
export SCALA_HOME=/usr/share/java
source ~/.bashrc
#Verify version and path:
scala -version
echo $SCALA_HOME

Spark:


#Java and Scala should be installed before installing Spark.
#Get the latest version of the Spark binary:
wget http://apache.cs.utah.edu/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz
#Extract the file:
tar xvzf spark-1.6.1-bin-hadoop2.6.tgz
#Update SPARK_HOME in ~/.bashrc to point at the extracted directory and execute the following command to update the current session:
export SPARK_HOME=/home/ubuntu/spark-1.6.1-bin-hadoop2.6
source ~/.bashrc
#Add SPARK_HOME to PATH:
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
#Verify the variables:
echo $SPARK_HOME
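The directory that tar creates (and that SPARK_HOME should point to) is simply the tarball name with the .tgz suffix removed; a quick sketch using shell parameter expansion:

```shell
#!/bin/sh
# Derive the extracted Spark directory name from the downloaded tarball.
tarball="spark-1.6.1-bin-hadoop2.6.tgz"
spark_dir="${tarball%.tgz}"   # strip the .tgz suffix
echo "SPARK_HOME should point to \$HOME/$spark_dir"
```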

Sparkling Water:


#The latest Spark pre-built for Hadoop should be installed, with SPARK_HOME pointing to it:
export SPARK_HOME="/path/to/spark/installation"
#To launch a local Spark cluster with 3 worker nodes, each with 2 cores and 1 GB of memory, export the MASTER variable:
export MASTER="local-cluster[3,2,1024]"
#Download and run Sparkling Water
wget http://h2o-release.s3.amazonaws.com/sparkling-water/rel-1.6/5/sparkling-water-1.6.5.zip
unzip sparkling-water-1.6.5.zip
cd sparkling-water-1.6.5
bin/sparkling-shell --conf "spark.executor.memory=1g"
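The MASTER value above packs three numbers into one string: worker count, cores per worker, and memory per worker in MB. A sketch that unpacks it with parameter expansion (for illustration only; Spark parses this itself):

```shell
#!/bin/sh
# Unpack a Spark local-cluster master string: local-cluster[workers,cores,mem_mb]
master="local-cluster[3,2,1024]"
spec="${master#local-cluster[}"   # drop the prefix
spec="${spec%]}"                  # drop the trailing bracket
workers="${spec%%,*}"
rest="${spec#*,}"
cores="${rest%%,*}"
mem_mb="${rest#*,}"
echo "$workers workers, $cores cores each, $mem_mb MB each"
```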

CUDA Toolkit:


#In order to build or run TensorFlow with GPU support, both NVIDIA's CUDA Toolkit (>= 7.0) and cuDNN (>= v2) need to be installed.
#To install CUDA toolkit, run:
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1410/x86_64/cuda-repo-ubuntu1410_7.0-28_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1410_7.0-28_amd64.deb
sudo apt-get update
sudo apt-get install cuda

cuDNN:


#To install cuDNN, download the file named cudnn-7.0-linux-x64-v4.0-prod.tgz after completing the NVIDIA developer questionnaire,
#then transfer it to your EC2 instance's home directory.
tar -zxf cudnn-7.0-linux-x64-v4.0-prod.tgz && rm cudnn-7.0-linux-x64-v4.0-prod.tgz
sudo cp -R cuda/lib64 /usr/local/cuda/lib64
sudo cp ~/cuda/include/cudnn.h /usr/local/cuda
#Reboot the system
sudo reboot
#Update environment variables as shown below:
export CUDA_HOME=/usr/local/cuda
export CUDA_ROOT=/usr/local/cuda
export PATH=$PATH:$CUDA_ROOT/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64
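After exporting these variables it is worth confirming the CUDA directories actually ended up in the search paths. A sketch using a pure string check (`list_contains` is a helper invented here; no CUDA install is required to run it):

```shell
#!/bin/sh
# Return 0 if a colon-separated list (like PATH) contains the given directory.
list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example checks against the variables exported above:
# list_contains "$PATH" /usr/local/cuda/bin && echo "CUDA bin is on PATH"
# list_contains "$LD_LIBRARY_PATH" /usr/local/cuda/lib64 && echo "CUDA libs are on LD_LIBRARY_PATH"
```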

Bazel:


#To install Bazel (>= v0.2), run:
sudo apt-get install pkg-config zip g++ zlib1g-dev
wget https://github.com/bazelbuild/bazel/releases/download/0.3.0/bazel-0.3.0-installer-linux-x86_64.sh
chmod +x bazel-0.3.0-installer-linux-x86_64.sh
./bazel-0.3.0-installer-linux-x86_64.sh --user

TensorFlow:


#Download and install TensorFlow:
wget https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl
sudo pip install --upgrade tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl
#Configure TensorFlow with GPU support enabled (run from a checkout of the TensorFlow source tree):
./configure

To build TensorFlow from source, run:


bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install --upgrade /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-any.whl

Run H2O Tensorflow Deep learning demo:


#Since we want to open the IPython notebook remotely, we will use the IP and port options. To start the TensorFlow notebook:
cd sparkling-water-1.6.5/
IPYTHON_OPTS="notebook --no-browser --ip='*' --port=54321" bin/pysparkling
#Note that the port specified in the above command should be open in the system.

Open http://PublicIP:54321 in a browser to start the IPython notebook console and click on TensorFlowDeepLearning.ipynb. Refer to the demo video for details.

#Sample .bashrc contents:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SCALA_HOME=/usr/share/java
export SPARK_HOME=/home/ubuntu/spark-1.6.1-bin-hadoop2.6
export MASTER="local-cluster[3,2,1024]"
export PATH=$PATH:$JAVA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin
export CUDA_HOME=/usr/local/cuda
export CUDA_ROOT=/usr/local/cuda
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/jvm/java-8-oracle/bin:/home/ubuntu/spark-1.6.1-bin-hadoop2.6/bin:/home/ubuntu/spark-1.6.1-bin-hadoop2.6/sbin:/usr/local/cuda/bin:/home/ubuntu/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
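The troubleshooting section below shows how to fetch the instance's public IP from the EC2 metadata service; combining it with the notebook port gives the URL to open. A sketch (`notebook_url` is a helper invented here; the metadata call is commented out because it only works on EC2):

```shell
#!/bin/sh
# Build the notebook URL from a public IP and the port passed to pysparkling.
notebook_url() {
    echo "http://$1:$2"
}

# On the instance itself:
# ip=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
# notebook_url "$ip" 54321
```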

Troubleshooting:
1) ERROR: Getting java.net.UnknownHostException while starting spark-shell
Solution:
Make sure /etc/hosts has an entry for the hostname.
E.g.: 127.0.0.1 hostname
2) ERROR: Getting "Could not find .egg-info directory in install record" error during IPython installation
Solution:

sudo pip install --upgrade setuptools pip

3) ERROR: Can’t find swig while configuring TF
Solution:

sudo apt-get install swig

4) ERROR: “Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5”
Solution:
Specify 3.0 for the CUDA compute capability when prompted by ./configure.
Please note that each additional compute capability significantly increases your build time and binary size.
5) ERROR: Could not insert ‘nvidia_352’: Unknown symbol in module, or unknown parameter (see dmesg)
Solution:

sudo apt-get install linux-image-extra-virtual

6) ERROR: Cannot find './util/python/python_include'
Solution:

sudo apt-get install python-dev

7) Find Public IP address of system
Solution:

curl http://169.254.169.254/latest/meta-data/public-ipv4

Demo Videos
