Distributed Deep Learning Models: Using TensorFlow and PyTorch on NVIDIA GPUs and Cluster of Raspberry PIs

Document
    Item Description
    Abstract
    This thesis focuses on distributed deep learning approaches to Human Activity Recognition (HAR) using a Recurrent Neural Network (RNN) Long Short-Term Memory (LSTM) model trained on the University of California, Irvine machine learning repository's HAR dataset. The work develops an LSTM residual bidirectional architecture in Python 3 over the distributed TensorFlow and PyTorch frameworks on two testbed systems: the first is a Raspberry Pi cluster built from 16 Raspberry Pis connected through a parameter-server architecture; the second is an NVIDIA GPU cluster equipped with three GPUs, a Tesla K40C, a Quadro P5000, and a Quadro K620. We compare the performance of our deep learning algorithms in terms of execution time and prediction accuracy while varying the number of deep layers and hidden neurons in the neural networks. The first comparison runs TensorFlow against PyTorch on the NVIDIA Maximus distributed multicore architecture; the second compares the Raspberry Pi cluster against an octa-core Intel Xeon CPU. We show that the distributed neural network implementations on the GPU cluster outperform both the Raspberry Pi cluster and the multicore system.
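    To make the LSTM model mentioned in the abstract concrete, below is a minimal NumPy sketch of a single LSTM cell's forward recurrence. The shapes (9 input channels, 128 time steps) echo the UCI HAR dataset's sensor windows, but the weights here are random placeholders for illustration, not the thesis's trained residual bidirectional model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias.
    Gate order in the stacked matrices: input, forget, cell candidate, output.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b          # all four gate pre-activations at once
    i = sigmoid(z[0:H])                   # input gate
    f = sigmoid(z[H:2 * H])               # forget gate
    g = np.tanh(z[2 * H:3 * H])           # candidate cell state
    o = sigmoid(z[3 * H:4 * H])           # output gate
    c = f * c_prev + i * g                # new cell state
    h = o * np.tanh(c)                    # new hidden state
    return h, c

# Hypothetical sizes mirroring the UCI HAR windows: 9 channels, 128 steps.
rng = np.random.default_rng(0)
D, H = 9, 32
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
seq = rng.standard_normal((128, D))       # one synthetic sensor window
for x_t in seq:
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape)  # (32,)
```

    A bidirectional variant runs a second cell over the reversed sequence and concatenates the two hidden states; the residual connection in the thesis's architecture additionally adds each layer's input to its output before passing it upward.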
    Note
    Ranbirsingh, Jagadish Kumar (author); Haklin Kimm, PhD (thesis advisor); Eun-Joo Lee, PhD (committee member); Minhaz Chowdhury, PhD (committee member); East Stroudsburg University of Pennsylvania, Computer Science (degree grantor)
    Resource Type
    Thesis
    Institution
    East Stroudsburg University of Pennsylvania