Starting PySpark

export IPYTHON=1  # PySpark can also run inside the IPython shell
pyspark --master yarn --num-executors <N>

This failed with the following error:

/opt/cloudera/parcels/CDH-5.3.-.cdh5.3.3.p0./bin/../lib/spark/bin/pyspark: line : exec: ipython: not found
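
The message points at the launcher itself: when IPYTHON is set, Spark 1.x's bin/pyspark hands control to ipython instead of python, and the exec fails if no ipython binary is on the PATH. A paraphrased sketch of that logic (not a verbatim quote of the script):

# paraphrased from Spark 1.x bin/pyspark -- not the exact script
if [[ -n "$IPYTHON" ]]; then
    exec ipython    # "exec: ipython: not found" when ipython is missing
else
    exec python     # the default interpreter
fi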

The cause is that ipython is not installed. A quick Google search for installation methods pointed to http://continuum.io/downloads#all, where the 64-bit Linux build of Anaconda can be downloaded:

wget https://3230d63b5fc54e62148e-c95ac804525aac4b6dba79b00b39d1d3.ssl.cf1.rackcdn.com/Anaconda-2.3.0-Linux-x86_64.sh

Once the download completes, run the installer:

bash Anaconda-2.3.0-Linux-x86_64.sh
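
As an aside, the installer can also run unattended; a hedged sketch using the batch flags Anaconda installers generally accept (-b accepts the license non-interactively, -p picks the install prefix):

bash Anaconda-2.3.0-Linux-x86_64.sh -b -p /root/anaconda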

In interactive mode the installer walks you through a shell-like dialog; answering yes at every prompt is fine. After Anaconda is installed, install ipython by entering:

pip install ipython

This pulls in a sizeable pile of packages. Once it finishes, the installer asks whether to update the environment variables; answer yes and the script automatically appends the following line to the end of ~/.bashrc:

export PATH=/root/anaconda/bin:$PATH

Run source ~/.bashrc to put the new PATH into effect:

source ~/.bashrc
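
Before relaunching, it is worth a quick sanity check that the shell now resolves ipython from Anaconda rather than failing to find it:

which ipython        # expected: /root/anaconda/bin/ipython
ipython --version    # should report the freshly installed release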

With that in place, run pyspark again:

pyspark --master yarn --num-executors <N>

It starts up with output like the following:

Type "copyright", "credits" or "license" for more information.

IPython 3.2. -- An enhanced Interactive Python.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
// :: INFO SecurityManager: Changing view acls to: root
// :: INFO SecurityManager: Changing modify acls to: root
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
// :: INFO Slf4jLogger: Slf4jLogger started
// :: INFO Remoting: Starting remoting
// :: INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@ip-172-31-25-243.us-west-2.compute.internal:50324]
// :: INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@ip-172-31-25-243.us-west-2.compute.internal:50324]
// :: INFO Utils: Successfully started service 'sparkDriver' on port .
// :: INFO SparkEnv: Registering MapOutputTracker
// :: INFO SparkEnv: Registering BlockManagerMaster
// :: INFO DiskBlockManager: Created local directory at /tmp/spark-local--7afc
// :: INFO MemoryStore: MemoryStore started with capacity 265.4 MB
// :: INFO HttpFileServer: HTTP File server directory is /tmp/spark-e2f4c4e3-dc9b-4db0-8fd4-1dcbb8819b05
// :: INFO HttpServer: Starting HTTP Server
// :: INFO Utils: Successfully started service 'HTTP file server' on port .
// :: INFO Utils: Successfully started service 'SparkUI' on port .
// :: INFO SparkUI: Started SparkUI at http://ip-172-31-25-243.us-west-2.compute.internal:4040
// :: INFO RMProxy: Connecting to ResourceManager at ip-172-31-25-243.us-west-2.compute.internal/172.31.25.243:
// :: INFO Client: Requesting a new application from cluster with NodeManagers
// :: INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster ( MB per container)
// :: INFO Client: Will allocate AM container, with MB memory including MB overhead
// :: INFO Client: Setting up container launch context for our AM
// :: INFO Client: Preparing resources for our AM container
// :: INFO Client: Setting up the launch environment for our AM container
// :: INFO SecurityManager: Changing view acls to: root
// :: INFO SecurityManager: Changing modify acls to: root
// :: INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
// :: INFO Client: Submitting application to ResourceManager
// :: INFO YarnClientImpl: Submitted application application_1436008024626_0001
// :: INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
// :: INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -
queue: root.root
start time:
final status: UNDEFINED
tracking URL: http://ip-172-31-25-243.us-west-2.compute.internal:8088/proxy/application_1436008024626_0001/
user: root
// :: INFO Client: Application report for application_1436008024626_0001 (state: ACCEPTED)
... (identical ACCEPTED reports repeated while the application waited for resources) ...
// :: INFO YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@ip-172-31-25-246.us-west-2.compute.internal:45245/user/YarnAM#-1546613765]
// :: INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ip-172-31-25-243.us-west-2.compute.internal, PROXY_URI_BASES -> http://ip-172-31-25-243.us-west-2.compute.internal:8088/proxy/application_1436008024626_0001), /proxy/application_1436008024626_0001
// :: INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
// :: INFO Client: Application report for application_1436008024626_0001 (state: RUNNING)
// :: INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: ip-172-31-25-246.us-west-2.compute.internal
ApplicationMaster RPC port:
queue: root.root
start time:
final status: UNDEFINED
tracking URL: http://ip-172-31-25-243.us-west-2.compute.internal:8088/proxy/application_1436008024626_0001/
user: root
// :: INFO YarnClientSchedulerBackend: Application application_1436008024626_0001 has started running.
// :: INFO NettyBlockTransferService: Server created on
// :: INFO BlockManagerMaster: Trying to register BlockManager
// :: INFO BlockManagerMasterActor: Registering block manager ip----.us-west-.compute.internal: with 265.4 MB RAM, BlockManagerId(<driver>, ip----.us-west-.compute.internal, )
// :: INFO BlockManagerMaster: Registered BlockManager
// :: INFO EventLoggingListener: Logging events to hdfs://ns-ha/user/spark/applicationHistory/application_1436008024626_0001
// :: INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@ip-172-31-25-245.us-west-2.compute.internal:43360/user/Executor#1565208330] with ID 1
// :: INFO RackResolver: Resolved ip----.us-west-.compute.internal to /default
// :: INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@ip-172-31-25-244.us-west-2.compute.internal:39613/user/Executor#-786460899] with ID 2
// :: INFO RackResolver: Resolved ip----.us-west-.compute.internal to /default
// :: INFO BlockManagerMasterActor: Registering block manager ip----.us-west-.compute.internal: with 530.3 MB RAM, BlockManagerId(, ip----.us-west-.compute.internal, )
// :: INFO BlockManagerMasterActor: Registering block manager ip----.us-west-.compute.internal: with 530.3 MB RAM, BlockManagerId(, ip----.us-west-.compute.internal, )
// :: INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: (ms)
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.2.-SNAPSHOT
      /_/

Using Python version 2.7. (default, May ::)
SparkContext available as sc.
In [1]:
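
At this point sc is a live SparkContext, so any small job confirms the YARN executors respond; a minimal smoke test (the numbers are illustrative, any tiny computation works):

In [1]: sc.parallelize(range(1000)).sum()
Out[1]: 499500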
