Santosh Srinivas

on 07 Nov 2016, tagged onApache Spark, Analytics, Data Minin

I've finally got to a long pending to-do-item to play with Apache Spark.

The following installation steps worked for me on Ubuntu 16.04.

The below options worked for me:

Download Apache Spark

  • Unzip and move Spark
cd ~/Downloads/
tar xzvf spark-2.0.1-bin-hadoop2.7.tgz
mv spark-2.0.1-bin-hadoop2.7/ spark
sudo mv spark/ /usr/lib/
  • Install SBT

As mentioned at sbt - Download

echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
sudo apt-get update
sudo apt-get install sbt
  • Make sure Java is installed

If not, install java

sudo apt-add-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
  • Configure Spark
cd /usr/lib/spark/conf/
cp spark-env.sh.template spark-env.sh
vi spark-env.sh

Add the following lines

JAVA_HOME=/usr/lib/jvm/java-8-oracle
SPARK_WORKER_MEMORY=4g
  • Configure IPv6

Basically, disable IPv6 using sudo vi /etc/sysctl.conf and add below lines

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
  • Configure .bashrc

I modified .bashrc in Sublime Text using subl ~/.bashrc and added the following lines

export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SBT_HOME=/usr/share/sbt-launcher-packaging/bin/sbt-launch.jar
export SPARK_HOME=/usr/lib/spark
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$SBT_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin
  • Configure fish (Optional - But I love the fish shell)

Modify config.fish using subl ~/.config/fish/config.fish and add the following lines

#Credit: http://fishshell.com/docs/current/tutorial.html#tut_startup

set -x PATH $PATH /usr/lib/spark
set -x PATH $PATH /usr/lib/spark/bin
set -x PATH $PATH /usr/lib/spark/sbin
  • Test Spark (Should work both in fish and bash)

Run pyspark (this is available in /usr/lib/spark/bin/) and test out.

For example ....

>>> a = 5
>>> b = 3
>>> a+b
8
>>> print(“Welcome to Spark”)
Welcome to Spark
## type Ctrl-d to exit

Try also, the built in run-example using run-example org.apache.spark.examples.SparkPi

That's it! You are ready to rock on using Apache Spark!

Next, I plan to checkout analysis using R as mentioned inhttp://www.milanor.net/blog/wp-content/uploads/2016/11/interactiveDataAnalysiswithSparkR_v5.pdf

Installing Apache Spark on Ubuntu 16.04的更多相关文章

  1. Install and Configure Apache Kafka on Ubuntu 16.04

    https://devops.profitbricks.com/tutorials/install-and-configure-apache-kafka-on-ubuntu-1604-1/ by hi ...

  2. Install LAMP Stack On Ubuntu 16.04

    原文:http://www.unixmen.com/how-to-install-lamp-stack-on-ubuntu-16-04/ LAMP is a combination of operat ...

  3. Ubuntu 16.04 LAMP server tutorial with Apache 2.4, PHP 7 and MariaDB (instead of MySQL)

    https://www.howtoforge.com/tutorial/install-apache-with-php-and-mysql-on-ubuntu-16-04-lamp/ This tut ...

  4. digitalocean --- How To Install Apache Tomcat 8 on Ubuntu 16.04

    https://www.digitalocean.com/community/tutorials/how-to-install-apache-tomcat-8-on-ubuntu-16-04 Intr ...

  5. 安装Hadoop及Spark(Ubuntu 16.04)

    安装Hadoop及Spark(Ubuntu 16.04) 安装JDK 下载jdk(以jdk-8u91-linux-x64.tar.gz为例) 新建文件夹 sudo mkdir /usr/lib/jvm ...

  6. 解决Ubuntu 16.04 上Android Studio2.3上面运行APP时提示DELETE_FAILED_INTERNAL_ERROR Error while Installing APKs的问题

    本人工作环境:Ubuntu 16.04 LTS + Android Studio 2.3 AVD启动之后,运行APP,报错提示: DELETE_FAILED_INTERNAL_ERROR Error ...

  7. Installing Moses on Ubuntu 16.04

    Installing Moses on Ubuntu 16.04 The process of installation To install requirements sudo apt-get in ...

  8. Installing Hyperledger Fabric v1.1 on Ubuntu 16.04 — Part I

    There is an entire library of Blockchain APIs which you can select according to the needs that suffi ...

  9. 如何在Ubuntu 16.04上安装Apache Web服务器

    转载自:https://www.howtoing.com/how-to-install-the-apache-web-server-on-ubuntu-16-04 介绍 Apache HTTP服务器是 ...

随机推荐

  1. 易普优APS签约本田汽车零部件八千代工业集团!

    2018年7月,易普优APS与八千代工业株式会社汽车零部件供应商正式签约,易普优APS在汽车零部件与整车行业的针对性解决方案的又一次得到客户高度认可与青睐! 日本八千代工业株式会社成立于1953年,总 ...

  2. python错误:UnicodeDecodeError: 'utf8' codec can't decode byte 0xe6 in position 0: unexpected end of data

    一.错误原因 在学习selenium自动化测试框架的时候,进行模仿浏览器搜索功能,输入英文是没问题,但是输入中文就报错,报错代码 def test_baidu_search(self): " ...

  3. R语言实战(十)处理缺失数据的高级方法

    本文对应<R语言实战>第15章:处理缺失数据的高级方法 本文仅在书的基础上进行简单阐述,更加详细的缺失数据问题研究将会单独写一篇文章. 处理缺失值的一般步骤: 识别缺失数据: 检查导致数据 ...

  4. FILE operattion

    #include "mainwindow.h"#include "ui_mainwindow.h"#include <QMessageBox>#in ...

  5. Java反射机制demo(四)—获取一个类的父类和实现的接口

    Java反射机制demo(四)—获取一个类的父类和实现的接口 1,Java反射机制得到一个类的父类 使用Class类中的getSuperClass()方法能够得到一个类的父类 如果此 Class 表示 ...

  6. Android自动化页面测速在美团的实践

    背景 随着移动互联网的快速发展,移动应用越来越注重用户体验.美团技术团队在开发过程中也非常注重提升移动应用的整体质量,其中很重要的一项内容就是页面的加载速度.如果发生冷启动时间过长.页面渲染时间过长. ...

  7. 初识Spring——Spring核心容器

    一. IOC和DI基础 IOC-Inversion of Control,译为控制反转,是一种遵循依赖倒置原则的代码设计思想. 所谓依赖倒置,就是把原本的高层建筑依赖底层建筑“倒置”过来,变成底层建筑 ...

  8. count 【mysql】

    如果你的需要是统计总行数时,为什么要使用count(*),而避免使用指定具体的列名? count()函数里面的参数是列名的的时候,那么会计算这个字段有值项的次数.也就是,该字段没有值的项并不会进入计算 ...

  9. Vue methods和computed

    <!DOCTYPE html> <html> <head> <meta charset="utf-8"> <title> ...

  10. SPOJ GSS

    GSS1 题目大意:给出一个数列,多次询问区间最长连续子段和 题解:线段树维护区间最长连续子段和gss,区间从最左元素开始的最长连续子段和lgss 区间以最右元素为结尾的最长连续子段和rgss以及区间 ...