Spark Standalone Mode Configuration

This post introduces Spark standalone mode and walks through the steps to configure it on several machines.

It is easy to configure from scratch. The notes below use spark-2.0.2-bin-hadoop2.7 as the example, on Debian Linux machines, for Scala programming.

Assume you have two machines with the IPs 192.168.0.51 and 192.168.0.52.

1. Preinstall Java, Scala, and sbt

See: https://www.scala-lang.org/download/install.html

http://www.scala-sbt.org/0.13/docs/Installing-sbt-on-Linux.html

2. Download a prebuilt Spark release with Hadoop, or compile it yourself

Download page: https://spark.apache.org/downloads.html

3. Unpack the archive and create a symlink for easier access later

e.g. run: ln -s /usr/local/spark-2.0.2-bin-hadoop2.7 /usr/local/spark
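
The unpack-and-link step can be tried safely in a scratch directory first. The sketch below is only an illustration: PREFIX is a temporary directory standing in for /usr/local, and the archive it creates is an empty stand-in for the real download so that nothing needs network access.

```shell
# Sketch: unpack a Spark archive and point a stable symlink at it.
# PREFIX is a scratch directory (assumption); the post itself uses /usr/local.
PREFIX="$(mktemp -d)"
VERSION="spark-2.0.2-bin-hadoop2.7"

# Stand-in for the downloaded archive, so the sketch runs without a download.
mkdir -p "$PREFIX/build/$VERSION/conf"
tar -czf "$PREFIX/$VERSION.tgz" -C "$PREFIX/build" "$VERSION"

# The actual steps: extract the archive, then link to the versioned directory.
tar -xzf "$PREFIX/$VERSION.tgz" -C "$PREFIX"
ln -sfn "$PREFIX/$VERSION" "$PREFIX/spark"

# The later steps refer to the install directory as SPARK_HOME.
export SPARK_HOME="$PREFIX/spark"
ls -ld "$SPARK_HOME"
```

Pointing SPARK_HOME at the symlink rather than the versioned directory means a later upgrade only needs the link to be re-created.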

4. Configure the Spark environment:

(1) Configure the slaves file: /usr/local/spark-2.0.2-bin-hadoop2.7/conf/slaves

# A Spark Worker will be started on each of the machines listed below.

192.168.0.51

192.168.0.52

(2) Configure spark-env.sh in the same conf directory, e.g.

#spark-env.sh

export SCALA_HOME=/usr/local/scala

export JAVA_HOME=/home/local/jdk

#export SPARK_LOCAL_IP=localhost

export SPARK_EXECUTOR_MEMORY=6g

export SPARK_EXECUTOR_CORES=6

export SPARK_MASTER_IP=192.168.0.51

export SPARK_MASTER_PORT=8070

export SPARK_MASTER_WEBUI_PORT=8080

#export SPARK_WORKER_INSTANCES=1

export SPARK_WORKER_PORT=8092

#export SPARK_WORKER_MEMORY=4g

#export SPARK_WORKER_CORES=4
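
To double-check what these two files imply, the following sketch writes trimmed copies of them into a scratch directory, sources spark-env.sh, and prints the master URL and web UI address that result. CONF_DIR, MASTER_URL and WEBUI_URL are names made up for this illustration; on a real install the files live in the conf directory of the Spark installation.

```shell
# Sketch: write the two config files and derive the URLs they imply.
# CONF_DIR is a scratch stand-in (assumption) for /usr/local/spark/conf.
CONF_DIR="$(mktemp -d)"

# Worker hosts, one per line.
cat > "$CONF_DIR/slaves" <<'EOF'
192.168.0.51
192.168.0.52
EOF

# A trimmed spark-env.sh with only the master settings from above.
cat > "$CONF_DIR/spark-env.sh" <<'EOF'
export SPARK_MASTER_IP=192.168.0.51
export SPARK_MASTER_PORT=8070
export SPARK_MASTER_WEBUI_PORT=8080
EOF

# Source the file and build the addresses from its variables.
. "$CONF_DIR/spark-env.sh"
MASTER_URL="spark://${SPARK_MASTER_IP}:${SPARK_MASTER_PORT}"
WEBUI_URL="http://${SPARK_MASTER_IP}:${SPARK_MASTER_WEBUI_PORT}"

echo "master URL: $MASTER_URL"
echo "web UI:     $WEBUI_URL"
echo "workers:    $(grep -c . "$CONF_DIR/slaves")"
```

The spark://host:port form is the address that applications and workers use to reach a standalone master, so it is worth noting down before starting the cluster.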

5. Set up passwordless SSH access

(1) Generate an SSH key without a passphrase

$ ssh-keygen -t rsa -P ""

(2) Append id_rsa.pub to authorized_keys

$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

(3) Run ssh localhost to confirm the key works, if you only want to run Spark standalone on a single local machine

$ ssh localhost
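
With two machines, the start scripts on the master log in to every host listed in the slaves file over SSH, so the public key also has to be in authorized_keys on each worker. One common way is ssh-copy-id. The sketch below only prints the commands as a dry run. WORKERS repeats the hosts from the slaves file, and USER_NAME is a hypothetical login name, so adjust both before running anything for real.

```shell
# Dry run: print the ssh-copy-id command for each worker host.
WORKERS="192.168.0.51 192.168.0.52"   # same hosts as in conf/slaves
USER_NAME="spark"                     # hypothetical login name (assumption)

copy_key_commands() {
  for host in $WORKERS; do
    echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub $USER_NAME@$host"
  done
}

copy_key_commands
```

Once the printed commands have been run by hand, ssh to each worker should no longer ask for a password.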

6. Start Spark

$SPARK_HOME/sbin/start-all.sh

Run jps to check that the Master and Worker processes are up. SPARK_HOME here is the install directory, e.g. /usr/local/spark from step 3.

7. Write and run your application

execute: sbt package

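execute: $SPARK_HOME/bin/spark-submit \
    --class "main.scala.MainAppTest" \
    --master local[4] \
    xxxxxxxx.jar

Note that --master local[4] runs the application on the local machine with 4 threads. To run it on the standalone cluster started above, pass the master URL instead, which is spark://192.168.0.51:8070 with the settings from step 4.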
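
For the cluster case, the submit command could be put together roughly as follows. This is a sketch that only prints the command. APP_JAR is a hypothetical sbt output path, the 6g executor memory simply mirrors SPARK_EXECUTOR_MEMORY from step 4, and SPARK_HOME falls back to /usr/local/spark if it is not set.

```shell
# Sketch: assemble (and only print) a spark-submit command for the standalone master.
SPARK_HOME="${SPARK_HOME:-/usr/local/spark}"
MASTER_URL="spark://192.168.0.51:8070"          # SPARK_MASTER_IP and SPARK_MASTER_PORT from step 4
APP_CLASS="main.scala.MainAppTest"
APP_JAR="target/scala-2.11/myapp_2.11-1.0.jar"  # hypothetical jar path (assumption)

submit_command() {
  # Print every argument followed by a space, then end the line.
  printf '%s ' "$SPARK_HOME/bin/spark-submit" \
    --class "$APP_CLASS" \
    --master "$MASTER_URL" \
    --executor-memory 6g \
    "$APP_JAR"
  printf '\n'
}

submit_command
```

Printing the command first makes it easy to check the master URL and jar path before the job is actually submitted.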
