1. Copy spark-env.sh.template in /root/spark-1.5.1-bin-hadoop2.6/conf to spark-env.sh, then add a HADOOP_CONF_DIR setting to it:

# Options read when launching programs locally with
# ./bin/run-example or ./bin/spark-submit
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program
# - SPARK_CLASSPATH, default classpath entries to append
export HADOOP_CONF_DIR=/etc/hadoop/conf
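
The copy-and-append step can be scripted so that it is safe to repeat. The following is only a sketch: `setup_spark_env` is a hypothetical helper name, and the two paths are passed in as arguments rather than hard-coded.

```shell
#!/usr/bin/env bash
# Sketch: create spark-env.sh from its template and set HADOOP_CONF_DIR.
# setup_spark_env is a hypothetical helper, not something Spark ships.
setup_spark_env() {
  local conf_dir="$1"      # e.g. /root/spark-1.5.1-bin-hadoop2.6/conf
  local hadoop_conf="$2"   # e.g. /etc/hadoop/conf
  local env_file="$conf_dir/spark-env.sh"

  # Copy the template only if spark-env.sh does not exist yet.
  if [ ! -f "$env_file" ]; then
    cp "$conf_dir/spark-env.sh.template" "$env_file" || return 1
  fi

  # Append the export once; the commented-out template line does not match.
  if ! grep -q '^export HADOOP_CONF_DIR=' "$env_file"; then
    printf 'export HADOOP_CONF_DIR=%s\n' "$hadoop_conf" >> "$env_file"
  fi
}
```

Calling `setup_spark_env /root/spark-1.5.1-bin-hadoop2.6/conf /etc/hadoop/conf` twice leaves a single export line in the file.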

2. Run the test program

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors \
--driver-memory 4g \
--executor-memory 2g \
--executor-cores \
--queue thequeue \
lib/spark-examples*.jar \
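
Both `--num-executors` and `--executor-cores` take an integer argument. A small wrapper can refuse to build the command when either count is missing. This is a sketch under assumptions: `pi_submit_cmd` is a hypothetical helper, the counts are supplied by the caller, and the remaining flags simply mirror the command above.

```shell
#!/usr/bin/env bash
# Sketch: print the SparkPi submit command only when both counts are integers.
# pi_submit_cmd is a hypothetical helper; it prints the command, it does not run it.
pi_submit_cmd() {
  local num_executors="$1" executor_cores="$2"
  local v
  for v in "$num_executors" "$executor_cores"; do
    case "$v" in
      # Reject an empty value or anything containing a non-digit.
      ''|*[!0-9]*) echo "expected an integer count, got '$v'" >&2; return 1 ;;
    esac
  done

  # Rebuild the positional parameters as the command line, then join with spaces.
  set -- ./bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    --num-executors "$num_executors" \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores "$executor_cores" \
    --queue thequeue \
    'lib/spark-examples*.jar'
  echo "$*"
}
```

Printing rather than executing keeps the sketch free of side effects: review the printed line, then run it from the Spark directory with counts that suit the cluster.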

At run time, the root user turned out to have no write permission on the HDFS directory /user, so the job failed:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:)
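
The first line of the exception already carries the whole diagnosis: /user is owned by hdfs:supergroup with mode drwxr-xr-x, so any other user, root included, can read and list it but cannot create anything in it. As a sketch, the fields can be pulled out of the message with sed; `parse_denied` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: split an HDFS "Permission denied" message into its fields.
# parse_denied is a hypothetical helper; it reads the message on stdin.
parse_denied() {
  sed -n 's/.*user=\([^,]*\), access=\([A-Z_]*\), inode="\([^"]*\)":\([^:]*\):\([^:]*\):\([^ ]*\).*/user=\1 access=\2 path=\3 owner=\4 group=\5 mode=\6/p'
}
```

Fed the message above, it reports the requesting user, the access type, the path, and the owner, group and mode of the inode that blocked the request.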

Changing the permissions on /user fixes this:

[root@node1 spark-1.5.1-bin-hadoop2.6]# sudo -u hdfs hdfs dfs -chmod  /user
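
A narrower alternative to loosening /user, not the approach used in this post, is to give the submitting user its own HDFS home directory, since the staging files end up under /user/root. The sketch below assumes the hdfs superuser can be reached through sudo; `ensure_hdfs_home` and `HDFS_DFS` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: create /user/<name> and hand it to that user, leaving /user untouched.
# ensure_hdfs_home and HDFS_DFS are hypothetical names for this example.
# HDFS_DFS defaults to running "hdfs dfs" as the hdfs superuser.
HDFS_DFS="${HDFS_DFS:-sudo -u hdfs hdfs dfs}"

ensure_hdfs_home() {
  local user="$1"
  # $HDFS_DFS is left unquoted on purpose: it holds a command plus its arguments.
  $HDFS_DFS -mkdir -p "/user/$user" || return 1
  $HDFS_DFS -chown "$user" "/user/$user"
}
```

With this in place, `ensure_hdfs_home root` would let root write its staging directory without changing the mode of /user itself.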

Run it again:

[root@node1 spark-1.5.1-bin-hadoop2.6]# ./bin/spark-submit --class org.apache.spark.examples.SparkPi     --master yarn-cluster     --num-executors      --driver-memory 4g     --executor-memory 2g     --executor-cores      --queue thequeue     lib/spark-examples*.jar
// :: INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.0.81:
// :: INFO yarn.Client: Requesting a new application from cluster with NodeManagers
// :: INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster ( MB per container)
// :: INFO yarn.Client: Will allocate AM container, with MB memory including MB overhead
// :: INFO yarn.Client: Setting up container launch context for our AM
// :: INFO yarn.Client: Setting up the launch environment for our AM container
// :: INFO yarn.Client: Preparing resources for our AM container
// :: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: INFO yarn.Client: Uploading resource file:/root/spark-1.5.1-bin-hadoop2.6/lib/spark-assembly-1.5.1-hadoop2.6.0.jar -> hdfs://node1:8020/user/root/.sparkStaging/application_1446368481906_0008/spark-assembly-1.5.1-hadoop2.6.0.jar
// :: INFO yarn.Client: Uploading resource file:/root/spark-1.5.1-bin-hadoop2.6/lib/spark-examples-1.5.1-hadoop2.6.0.jar -> hdfs://node1:8020/user/root/.sparkStaging/application_1446368481906_0008/spark-examples-1.5.1-hadoop2.6.0.jar
// :: INFO yarn.Client: Uploading resource file:/tmp/spark-72a1a44a-029c--acd1-6fbd44f5709a/__spark_conf__2902795872463320162.zip -> hdfs://node1:8020/user/root/.sparkStaging/application_1446368481906_0008/__spark_conf__2902795872463320162.zip
// :: INFO spark.SecurityManager: Changing view acls to: root
// :: INFO spark.SecurityManager: Changing modify acls to: root
// :: INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
// :: INFO yarn.Client: Submitting application to ResourceManager
// :: INFO impl.YarnClientImpl: Submitted application application_1446368481906_0008
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -
queue: root.thequeue
start time:
final status: UNDEFINED
tracking URL: http://node1:8088/proxy/application_1446368481906_0008/
user: root
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: ACCEPTED)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.0.83
ApplicationMaster RPC port:
queue: root.thequeue
start time:
final status: UNDEFINED
tracking URL: http://node1:8088/proxy/application_1446368481906_0008/
user: root
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: RUNNING)
// :: INFO yarn.Client: Application report for application_1446368481906_0008 (state: FINISHED)
// :: INFO yarn.Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.0.83
ApplicationMaster RPC port:
queue: root.thequeue
start time:
final status: SUCCEEDED
tracking URL: http://node1:8088/proxy/application_1446368481906_0008/A
user: root
// :: INFO util.ShutdownHookManager: Shutdown hook called
// :: INFO util.ShutdownHookManager: Deleting directory /tmp/spark-72a1a44a-029c--acd1-6fbd44f5709a
[root@node1 spark-1.5.1-bin-hadoop2.6]#
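
In yarn-cluster mode the driver runs inside the ApplicationMaster, so the client output above shows only state transitions and the value computed by SparkPi lands in the container logs. As a sketch, the application ID and final status can be picked out of a saved client log; `app_id_from_log` and `final_status_from_log` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: extract the application ID and the last reported final status
# from spark-submit client output read on stdin. Both names are hypothetical.
app_id_from_log() {
  # First application_<clusterTimestamp>_<sequence> token in the log.
  grep -o 'application_[0-9]*_[0-9]*' | head -n 1
}

final_status_from_log() {
  # Keep overwriting s so that the last "final status:" line wins.
  awk -F': ' '/final status:/ { s = $2 } END { print s }'
}
```

If log aggregation is enabled on the cluster, the extracted ID can then be passed to `yarn logs -applicationId <id>` to read the driver's output.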

3. Use spark-sql

Copy /etc/hive/conf/hive-site.xml into /root/spark-1.5.1-bin-hadoop2.6/conf, then start spark-sql:

[root@node1 spark-1.5.1-bin-hadoop2.6]# cp /etc/hive/conf/hive-site.xml conf/
[root@node1 spark-1.5.1-bin-hadoop2.6]# ./bin/spark-sql
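
The copy can be guarded so that a missing hive-site.xml is reported before spark-sql starts without a metastore configuration. This is a sketch only; `install_hive_site` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: copy hive-site.xml into Spark's conf directory, failing loudly
# if the source file is absent. install_hive_site is a hypothetical helper.
install_hive_site() {
  local src="$1"        # e.g. /etc/hive/conf/hive-site.xml
  local conf_dir="$2"   # e.g. conf (relative to the Spark directory)
  if [ ! -r "$src" ]; then
    echo "hive-site.xml not readable: $src" >&2
    return 1
  fi
  cp "$src" "$conf_dir/hive-site.xml"
}
```

A quick smoke test after the copy might be `install_hive_site /etc/hive/conf/hive-site.xml conf && ./bin/spark-sql -e 'SHOW TABLES;'`, which should list the tables known to the Hive metastore.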
