1. Deployment Environment

  • OS: Red Hat Enterprise Linux Server release 6.4 (Santiago)
  • Hadoop: 2.4.1
  • Hive: 0.11.0
  • JDK: 1.7.0_60
  • Python: 2.6.6 (the Spark cluster requires Python 2.6 or later, otherwise Python jobs cannot run on the cluster)
  • Spark: 0.9.1 (the latest release is 1.1.0)
  • Shark: 0.9.1 (the latest version available, but it is only compatible up to spark-0.9.1; see the Shark 0.9.1 release notes)
  • Zookeeper: 2.3.5 (used for HA; for Spark HA setup see my post on the two ways to configure Spark Master High Availability (HA))
  • Scala: 2.11.2

2. Spark Cluster Plan

  • Account: ebupt
  • master: eb174
  • slaves: eb174, eb175, eb176

3. Set Up Passwordless SSH

cd ~
# generate the public/private key pair
ssh-keygen -q -t rsa -N "" -f /home/ebupt/.ssh/id_rsa
cd .ssh
cat id_rsa.pub > authorized_keys
chmod go-wx authorized_keys
# copy authorized_keys into /home/ebupt/.ssh on every slave
scp ~/.ssh/authorized_keys ebupt@eb175:~/.ssh/
scp ~/.ssh/authorized_keys ebupt@eb176:~/.ssh/

A simpler alternative:

Since the lab node eb170 can already ssh into every machine, just copy everything under eb170's ~/.ssh/ into eb174's ~/.ssh/. The advantage is that eb170's existing passwordless login remains untouched.

[ebupt@eb174 ~]$ rm ~/.ssh/*
[ebupt@eb170 ~]$ scp -r ~/.ssh/ ebupt@eb174:~/.ssh/
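
Either way, passwordless login from the master can be verified with a quick loop (a sketch, using the host names planned in section 2):

# each ssh call should print the remote hostname without asking for a password
for host in eb174 eb175 eb176; do
  ssh ebupt@$host hostname
done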

4. Deploy Scala (copy it identically to all nodes)

tar zxvf scala-2.11.2.tgz

ln -s /home/ebupt/eb/scala-2.11.2 ~/scala

vi ~/.bash_profile

# add the environment variables
export SCALA_HOME=$HOME/scala
export PATH=$PATH:$SCALA_HOME/bin

Running scala -version should print the current Scala version, which confirms that Scala is installed correctly.

[ebupt@eb174 ~]$ scala -version
Scala code runner version 2.11.2 -- Copyright 2002-2013, LAMP/EPFL

5. Install Spark (copy it identically to all nodes)

Untar the package, create a symlink, and set the environment variables, as sketched below.
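
A minimal sketch (the archive name spark-0.9.1-bin-hadoop2.tgz and the /home/ebupt/eb/ unpack directory are assumptions, matching the layout used elsewhere in this post):

tar zxvf spark-0.9.1-bin-hadoop2.tgz -C /home/ebupt/eb/
ln -s /home/ebupt/eb/spark-0.9.1-bin-hadoop2 ~/spark
# append to ~/.bash_profile (the full file is shown in section 7.1)
export SPARK_HOME=$HOME/spark
export PATH=$PATH:$SPARK_HOME/bin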

[ebupt@eb174 ~]$ vi spark/conf/slaves

# add the slaves
eb174
eb175
eb176

[ebupt@eb174 ~]$ vi spark/conf/spark-env.sh

export SCALA_HOME=/home/ebupt/scala
export JAVA_HOME=/home/ebupt/eb/jdk1.7.0_60
export SPARK_MASTER_IP=eb174
export SPARK_WORKER_MEMORY=4000m

6. Install Shark (copy it identically to all nodes)

Untar, symlink, and set the environment variables exactly as for Spark above.

[ebupt@eb174 ~]$ vi shark/conf/shark-env.sh

export SPARK_MEM=1g

# (Required) Set the master program's memory
export SHARK_MASTER_MEM=1g

# (Optional) Specify the location of Hive's configuration directory. By default,
# Shark run scripts will point it to $SHARK_HOME/conf
export HIVE_HOME=/home/ebupt/hive
export HIVE_CONF_DIR="$HIVE_HOME/conf"

# For running Shark in distributed mode, set the following:
export HADOOP_HOME=/home/ebupt/hadoop
export SPARK_HOME=/home/ebupt/spark
export MASTER=spark://eb174:7077
# Only required if using Mesos:
#export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
source $SPARK_HOME/conf/spark-env.sh

# LZO compression native lib
export LD_LIBRARY_PATH=/home/ebupt/hadoop/share/hadoop/common

# (Optional) Extra classpath
export SPARK_LIBRARY_PATH=/home/ebupt/hadoop/lib/native

# Java options
# On EC2, change the local.dir to /mnt/tmp
SPARK_JAVA_OPTS=" -Dspark.local.dir=/tmp "
SPARK_JAVA_OPTS+="-Dspark.kryoserializer.buffer.mb=10 "
SPARK_JAVA_OPTS+="-verbose:gc -XX:-PrintGCDetails -XX:+PrintGCTimeStamps "
SPARK_JAVA_OPTS+="-XX:MaxPermSize=256m "
SPARK_JAVA_OPTS+="-Dspark.cores.max=12 "
export SPARK_JAVA_OPTS

# (Optional) Tachyon Related Configuration
#export TACHYON_MASTER="" # e.g. "localhost:19998"
#export TACHYON_WAREHOUSE_PATH=/sharktables # Could be any valid path name
export SCALA_HOME=/home/ebupt/scala
export JAVA_HOME=/home/ebupt/eb/jdk1.7.0_60
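
Before continuing, it is worth confirming that every directory referenced in shark-env.sh actually exists on the node; a small check (a sketch, paths as configured above):

# print OK/MISSING for each directory shark-env.sh points at
source ~/shark/conf/shark-env.sh
for d in "$SCALA_HOME" "$JAVA_HOME" "$HIVE_HOME" "$HADOOP_HOME" "$SPARK_HOME"; do
  [ -d "$d" ] && echo "OK      $d" || echo "MISSING $d"
done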

7. Scripts for Syncing to the Slaves

7.1 ~/.bash_profile on the master (eb174)

# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
  . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

export JAVA_HOME=/home/ebupt/eb/jdk1.7.0_60
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export HADOOP_HOME=$HOME/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export ZOOKEEPER_HOME=$HOME/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

export HIVE_HOME=$HOME/hive
export PATH=$HIVE_HOME/bin:$PATH

export HBASE_HOME=$HOME/hbase
export PATH=$PATH:$HBASE_HOME/bin

export MAVEN_HOME=$HOME/eb/apache-maven-3.0.
export PATH=$PATH:$MAVEN_HOME/bin

export STORM_HOME=$HOME/storm
export PATH=$PATH:$STORM_HOME/storm-yarn-master/bin:$STORM_HOME/storm-0.9.-wip21/bin

export SCALA_HOME=$HOME/scala
export PATH=$PATH:$SCALA_HOME/bin

export SPARK_HOME=$HOME/spark
export PATH=$PATH:$SPARK_HOME/bin

export SHARK_HOME=$HOME/shark
export PATH=$PATH:$SHARK_HOME/bin

7.2 Sync script: syncInstall.sh

scp -r /home/ebupt/eb/scala-2.11.2 ebupt@eb175:/home/ebupt/eb/
scp -r /home/ebupt/eb/scala-2.11.2 ebupt@eb176:/home/ebupt/eb/
scp -r /home/ebupt/eb/spark-1.0.2-bin-hadoop2 ebupt@eb175:/home/ebupt/eb/
scp -r /home/ebupt/eb/spark-1.0.2-bin-hadoop2 ebupt@eb176:/home/ebupt/eb/
scp -r /home/ebupt/eb/spark-0.9.1-bin-hadoop2 ebupt@eb175:/home/ebupt/eb/
scp -r /home/ebupt/eb/spark-0.9.1-bin-hadoop2 ebupt@eb176:/home/ebupt/eb/
scp ~/.bash_profile ebupt@eb175:~/
scp ~/.bash_profile ebupt@eb176:~/

7.3 Setup script: build.sh

#!/bin/bash
source ~/.bash_profile
ssh eb175 > /dev/null 2>&1 << eeooff
ln -s /home/ebupt/eb/scala-2.11.2/ /home/ebupt/scala
ln -s /home/ebupt/eb/spark-0.9.1-bin-hadoop2/ /home/ebupt/spark
ln -s /home/ebupt/eb/shark-0.9.1-bin-hadoop2/ /home/ebupt/shark
source ~/.bash_profile
exit
eeooff
echo eb175 done!
ssh eb176 > /dev/null 2>&1 << eeooffxx
ln -s /home/ebupt/eb/scala-2.11.2/ /home/ebupt/scala
ln -s /home/ebupt/eb/spark-0.9.1-bin-hadoop2/ /home/ebupt/spark
ln -s /home/ebupt/eb/shark-0.9.1-bin-hadoop2/ /home/ebupt/shark
source ~/.bash_profile
exit
eeooffxx
echo eb176 done!
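
With both scripts in place, a typical run from the master looks like this (a sketch, using the file names above):

[ebupt@eb174 ~]$ sh syncInstall.sh   # push the unpacked packages and ~/.bash_profile to eb175 and eb176
[ebupt@eb174 ~]$ sh build.sh         # create the symlinks and reload the profile on each slave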

8. Problems Encountered and Their Solutions

8.1 With shark-0.9.1 and spark-1.0.2 installed, executing SQL in the Shark shell throws an error.

shark> select * from test;
17.096: [Full GC 71198K->24382K(506816K), 0.3150970 secs]
Exception in thread "main" java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$SetOwnerRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
    at java.lang.Class.privateGetPublicMethods(Class.java:2651)
    at java.lang.Class.privateGetPublicMethods(Class.java:2661)
    at java.lang.Class.getMethods(Class.java:1467)
    at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
    at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
    at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:636)
    at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:722)
    at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:92)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:334)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:241)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:141)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:576)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:521)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:146)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:180)
    at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
    at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
    at shark.parse.SharkSemanticAnalyzer.analyzeInternal(SharkSemanticAnalyzer.scala:137)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:279)
    at shark.SharkDriver.compile(SharkDriver.scala:215)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
    at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:338)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at shark.SharkCliDriver$.main(SharkCliDriver.scala:235)
    at shark.SharkCliDriver.main(SharkCliDriver.scala)

Cause: the message is cryptic, but it essentially indicates a protobuf version conflict.

Fix: locate the jar hive-exec-0.11.0-shark-0.9.1.jar under $SHARK_HOME/lib_managed/jars/edu.berkeley.cs.shark/hive-exec, strip the bundled protobuf classes, and repack it. After that the error no longer occurs. The steps are shown below.

cd $SHARK_HOME/lib_managed/jars/edu.berkeley.cs.shark/hive-exec
unzip hive-exec-0.11.0-shark-0.9.1.jar
rm -f com/google/protobuf/*
rm hive-exec-0.11.0-shark-0.9.1.jar
zip -r hive-exec-0.11.0-shark-0.9.1.jar *
rm -rf com hive-exec-log4j.properties javaewah/ javax/ javolution/ META-INF/ org/
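
A quick way to confirm the repack worked (a sketch):

# the repacked jar should contain no protobuf classes
cd $SHARK_HOME/lib_managed/jars/edu.berkeley.cs.shark/hive-exec
unzip -l hive-exec-0.11.0-shark-0.9.1.jar | grep "com/google/protobuf" \
  && echo "protobuf classes still present" \
  || echo "protobuf classes removed"

Since Shark was copied in full to every node (section 6), the repacked jar may also need to be pushed to eb175 and eb176.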

8.2 With shark-0.9.1 and spark-1.0.2 installed, the Spark cluster itself runs fine and simple jobs succeed, but Shark jobs always fail with "Spark cluster looks dead, giving up." When starting the Shark shell (or shark-withinfo), it cannot connect to the Spark master. The error looks like this:

shark> select * from t1;
16.452: [GC 282770K->32068K(1005568K), 0.0388780 secs]
org.apache.spark.SparkException: Job aborted: Spark cluster looks down
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
FAILED: Execution Error, return code -101 from shark.execution.SparkTask

Cause: many people online have hit the same problem: the Spark cluster is healthy, yet Shark refuses to run against it. The shark-0.9.1 release notes show why:

Release date: April 10, 2014
Shark 0.9.1 is a maintenance release that stabilizes 0.9.0, which bumps up Scala compatibility to 2.10.3 and Hive compliance to 0.11. The core dependencies for this version are:
Scala 2.10.3
Spark 0.9.1
AMPLabs Hive 0.9.0
(Optional) Tachyon 0.4.1

This happens because Shark is only compatible up to spark-0.9.1; the version mismatch prevents Shark from finding the Spark cluster's master service.

Fix: roll Spark back to spark-0.9.1 (the Scala version does not need to be rolled back). After the rollback everything runs normally; a rollback sketch is shown below.
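
A minimal rollback sketch (it assumes spark-0.9.1-bin-hadoop2 has already been unpacked under /home/ebupt/eb/ on every node, e.g. via syncInstall.sh in section 7.2):

# repoint the ~/spark symlink at spark-0.9.1 on the master and both slaves
for host in eb174 eb175 eb176; do
  ssh ebupt@$host 'rm -f ~/spark && ln -s /home/ebupt/eb/spark-0.9.1-bin-hadoop2 ~/spark'
done
# restart the standalone cluster from the master
~/spark/sbin/stop-all.sh
~/spark/sbin/start-all.sh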

9. Running the Cluster Successfully

9.1 Start the Spark cluster in standalone mode

[ebupt@eb174 ~]$ ./spark/sbin/start-all.sh

9.2 Test the Spark cluster

[ebupt@eb174 ~]$ ./spark/bin/run-example org.apache.spark.examples.SparkPi 10 spark://eb174:7077

9.3 Spark Master UI: http://eb174:8080/
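
Once the Spark master is up, Shark can be smoke-tested the same way (a sketch; the table t1 is just the example table from section 8 and is assumed to already exist in Hive):

[ebupt@eb174 ~]$ ./shark/bin/shark-withinfo
shark> show tables;
shark> select count(*) from t1;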

10. References

  1. Apache Spark
  2. Apache Shark
  3. Shark安装部署与应用 (Shark installation, deployment and usage)
  4. Spark github
  5. Shark github
  6. Spark 0.9.1和Shark 0.9.1分布式安装指南 (Spark 0.9.1 and Shark 0.9.1 distributed installation guide)
  7. Google group: shark-users
  8. ERIC'S BLOG
