Spark-1.5.1 on CDH-5.4.7
1.修改拷贝/root/spark-1.5.1-bin-hadoop2.6/conf下面spark-env.sh.template到spark-env.sh,并添加设置HADOOP_CONF_DIR:
# Options read when launching programs locally with
# ./bin/run-example or ./bin/spark-submit
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program
# - SPARK_CLASSPATH, default classpath entries to append
export HADOOP_CONF_DIR=/etc/hadoop/conf
2.运行测试程序
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores \
--queue thequeue \
lib/spark-examples*.jar \
在运行时发现root用户没有hdfs目录/user/的写权限,导致任务失败:
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:)
修改/user目录的权限即可:
[root@node1 spark-1.5.-bin-hadoop2.]# sudo -u hdfs hdfs dfs -chmod /user
重新运行:
[root@node1 spark-1.5.-bin-hadoop2.]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors --driver-memory 4g --executor-memory 2g --executor-cores --queue thequeue lib/spark-examples*.jar
// :: INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.0.81:
// :: INFO yarn.Client: Requesting a new application from cluster with NodeManagers
// :: INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster ( MB per container)
// :: INFO yarn.Client: Will allocate AM container, with MB memory including MB overhead
// :: INFO yarn.Client: Setting up container launch context for our AM
// :: INFO yarn.Client: Setting up the launch environment for our AM container
// :: INFO yarn.Client: Preparing resources for our AM container
// :: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: INFO yarn.Client: Uploading resource file:/root/spark-1.5.-bin-hadoop2./lib/spark-assembly-1.5.-hadoop2.6.0.jar -> hdfs://node1:8020/user/root/.sparkStaging/application_1446368481906_0008/spark-assembly-1.5.1-hadoop2.6.0.jar
// :: INFO yarn.Client: Uploading resource file:/root/spark-1.5.-bin-hadoop2./lib/spark-examples-1.5.-hadoop2.6.0.jar -> hdfs://node1:8020/user/root/.sparkStaging/application_1446368481906_0008/spark-examples-1.5.1-hadoop2.6.0.jar
// :: INFO yarn.Client: Uploading resource file:/tmp/spark-72a1a44a-029c--acd1-6fbd44f5709a/__spark_conf__2902795872463320162.zip -> hdfs://node1:8020/user/root/.sparkStaging/application_1446368481906_0008/__spark_conf__2902795872463320162.zip
// :: INFO spark.SecurityManager: Changing view acls to: root
// :: INFO spark.SecurityManager: Changing modify acls to: root
// :: INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
// :: INFO yarn.Client: Submitting application to ResourceManager
// :: INFO impl.YarnClientImpl: Submitted application application_1446368481906_0008
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -
queue: root.thequeue
start time:
final status: UNDEFINED
tracking URL: http://node1:8088/proxy/application_1446368481906_0008/
user: root
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.0.83
ApplicationMaster RPC port:
queue: root.thequeue
start time:
final status: UNDEFINED
tracking URL: http://node1:8088/proxy/application_1446368481906_0008/
user: root
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: FINISHED)
// :: INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.0.83
ApplicationMaster RPC port:
queue: root.thequeue
start time:
final status: SUCCEEDED
tracking URL: http://node1:8088/proxy/application_1446368481906_0008/A
user: root
// :: INFO util.ShutdownHookManager: Shutdown hook called
// :: INFO util.ShutdownHookManager: Deleting directory /tmp/spark-72a1a44a-029c--acd1-6fbd44f5709a
[root@node1 spark-1.5.-bin-hadoop2.]#
3.使用spark-sql
将/etc/hive/conf/hive-site.xml拷贝到/root/spark-1.5.1-bin-hadoop2.6/conf下,启动spark-sql即可
[root@node1 spark-1.5.-bin-hadoop2.]# cp /etc/hive/conf/hive-site.xml conf/
[root@node1 spark-1.5.-bin-hadoop2.]# ./bin/spark-sql
Spark-1.5.1 on CDH-5.4.7的更多相关文章
- spark on yarn 资源调度(cdh为例)
一.CPU配置: ApplicationMaster 虚拟 CPU内核 yarn.app.mapreduce.am.resource.cpu-vcores ApplicationMaster占用的cp ...
- 【Spark】必须要用CDH版本的Spark?那你是不是需要重新编译?
目录 为什么要重新编译? 步骤 一.下载Spark的源码 二.准备linux环境,安装必须软件 三.解压spark源码,修改配置,准备编译 四.开始编译 为什么要重新编译? 由于我们所有的环境统一使用 ...
- 1、Spark 2.1 源码编译支持CDH
目前CDH支持的spark版本都是1.x, 如果想要使用spark 2x的版本, 只能编译spark源码生成支持CDH的版本. 一.准备工作 找一台Linux主机, 由于spark源码编译会下载很多的 ...
- Why Apache Spark is a Crossover Hit for Data Scientists [FWD]
Spark is a compelling multi-purpose platform for use cases that span investigative, as well as opera ...
- 转:Sharethrough使用Spark Streaming优化实时竞价
文章来自于:http://www.infoq.com/cn/news/2014/04/spark-streaming-bidding 来自于Sharethrough的数据基础设施工程师Russell ...
- CDH集群安装&测试总结
0.绪论 之前完全没有接触过大数据相关的东西,都是书上啊,媒体上各种吹嘘啊,我对大数据,集群啊,分布式计算等等概念真是高山仰止,充满了仰望之情,觉得这些东西是这样的: 当我搭建的过程中,发现这些东西是 ...
- hive on spark
hive on spark 的配置及设置CDH都已配置好,直接使用就行,但是我在用的时候报错,如下: 具体操作如下时报的错: 在hive 里执行以下命令: set hive.exec ...
- hive使用spark引擎的几种情况
使用spark引擎查询hive有以下几种方式:1>使用spark-sql(spark sql cli)2>使用spark-thrift提交查询sql3>使用hive on spark ...
- CDH集群spark-shell执行过程分析
目的 刚入门spark,安装的是CDH的版本,版本号spark-core_2.11-2.4.0-cdh6.2.1,部署了cdh客户端(非集群节点),本文主要以spark-shell为例子,对在cdh客 ...
- 部署开启了Kerberos身份验证的大数据平台集群外客户端
转载请注明出处 :http://www.cnblogs.com/xiaodf/ 本文档主要用于说明,如何在集群外节点上,部署大数据平台的客户端,此大数据平台已经开启了Kerberos身份验证.通过客户 ...
随机推荐
- ElasticSearch与Spring Boot集成问题
1.None of the configured nodes are available 或者 org.elasticsearch.transport.RemoteTransportException ...
- asp.net mvc bundle中数组超出索引
在使用bundle 来加载css的时候报错了, @Styles.Render("~/bundles/appStyles") 第一反应 以为是的css 太多了,可是当我这个style ...
- 使用Gson转换json数据为Java对象的一个例子
记录工作中碰到的一个内容. 原料是微信平台的一个接口json数据. { "errcode" : 0, "errmsg" : "ok", &q ...
- 1005. Spell It Right (20)
Given a non-negative integer N, your task is to compute the sum of all the digits of N, and output e ...
- Linux shell ”Press any key to continue ”功能实现
function process_continue(){ SAVESTTY=`stty -g` stty cbreak dd if=/dev/tty bs=1 count=1 > /dev/nu ...
- linux网卡混杂模式
混杂模式就是接收所有经过网卡的数据包,包括不是发给本机的包,即不验证MAC地址.普通模式下网卡只接收发给本机的包(包括广播包)传递给上层程序,其它的包一律丢弃.一般来说,混杂模式不会影响网卡的正常工作 ...
- springMvc 使用ajax上传文件,返回获取的文件数据 附Struts2文件上传
总结一下 springMvc使用ajax文件上传 首先说明一下,以下代码所解决的问题 :前端通过input file 标签获取文件,通过ajax与后端交互,后端获取文件,读取excel文件内容,返回e ...
- linux环境初始化 用户问题
linux 初始化系统配置(centos6) (2013-04-03 13:19:15) 转载▼ 分类: linux 这篇博文是从别处转来的,原文地址http://zhoualine.iteye. ...
- [重要公告] 关于禁止发布Windows系统及非法激活软件的通知
Skyfree 发表于 2013-11-15 09:45:17 https://www.itsk.com/thread-306891-1-1.html 接微软方面法务通知,要求删除涉及发布Win8/8 ...
- Spring中servletFileUpload完成上传文件以及文本的处理
JSP: <%@ page language="java" contentType="text/html; charset=UTF-8" pageEnco ...