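Hadoop Ecosystem: Oozie in Practice, Chaining Multiple Jobs in One Workflow

                                      Author: 尹正杰 (Yin Zhengjie)

Copyright notice: this is an original work. Reproduction is not permitted, and violations will be pursued legally.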

1>.Start the Hadoop cluster

[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]# sbin/start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [s101]
s101: starting namenode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-namenode-yinzhengjie.out
s102: starting datanode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-datanode-s102.out
s104: starting datanode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-datanode-s104.out
s103: starting datanode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-datanode-s103.out
Starting secondary namenodes [s101]
s101: starting secondarynamenode, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/hadoop-root-secondarynamenode-yinzhengjie.out
starting yarn daemons
starting resourcemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-yinzhengjie-resourcemanager-s101.out
s104: starting nodemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-root-nodemanager-s104.out
s102: starting nodemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-root-nodemanager-s102.out
s103: starting nodemanager, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/yarn-root-nodemanager-s103.out
[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]#

Start the Hadoop cluster ([root@yinzhengjie hadoop-2.5.0-cdh5.3.6]# sbin/start-all.sh)

[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]# sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/logs/mapred-yinzhengjie-historyserver-s101.out
[root@yinzhengjie hadoop-2.5.0-cdh5.3.6]#

Start the JobHistory server ([root@yinzhengjie hadoop-2.5.0-cdh5.3.6]# sbin/mr-jobhistory-daemon.sh start historyserver)

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# bin/oozied.sh start

Setting OOZIE_HOME:          /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6
Setting OOZIE_CONFIG: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/conf
Sourcing: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/conf/oozie-env.sh
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Setting OOZIE_DATA: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/data
Setting OOZIE_LOG: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD:
hostname: Name or service not known
Setting OOZIE_HTTP_HOSTNAME:
Setting OOZIE_HTTP_PORT:
Setting OOZIE_ADMIN_PORT:
Setting OOZIE_HTTPS_PORT:
Setting OOZIE_BASE_URL: http://:11000/oozie
Setting CATALINA_BASE: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server
Setting OOZIE_HTTPS_KEYSTORE_FILE: /root/.keystore
Setting OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID:
Setting CATALINA_OUT: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs/catalina.out
Setting CATALINA_PID: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/temp/oozie.pid
Using CATALINA_OPTS: -Xmx1024m -Dderby.stream.error.file=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6 -Doozie.config.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/conf -Doozie.log.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/logs -Doozie.data.dir=/home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/data -Doozie.instance.id= -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload= -Doozie.http.hostname= -Doozie.admin.port= -Doozie.http.port= -Doozie.https.port= -Doozie.base.url=http://:11000/oozie -Doozie.https.keystore.file=/root/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=
Using CATALINA_BASE: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server
Using CATALINA_HOME: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server
Using CATALINA_TMPDIR: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/temp
Using JRE_HOME: /soft/jdk
Using CLASSPATH: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/bin/bootstrap.jar
Using CATALINA_PID: /home/yinzhengjie/download/cdh/oozie-4.0.0-cdh5.3.6/oozie-server/temp/oozie.pid
Existing PID file found during start.
Removing/clearing stale PID file.
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#

Start Oozie ([root@yinzhengjie oozie-4.0.0-cdh5.3.6]# bin/oozied.sh start)

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# xcall.sh jps
============= s101 jps ============
ResourceManager
SecondaryNameNode
JobHistoryServer
Jps
Bootstrap
NameNode
命令执行成功
============= s102 jps ============
Jps
NodeManager
DataNode
命令执行成功
============= s103 jps ============
DataNode
Jps
NodeManager
命令执行成功
============= s104 jps ============
NodeManager
DataNode
Jps
命令执行成功
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#

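Check that all processes started successfully ([root@yinzhengjie oozie-4.0.0-cdh5.3.6]# xcall.sh jps)

  Check that the Oozie web UI is reachable at http://s101:11000/oozie, the same server URL that step 7 submits to.

2>.Extract the official example templates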

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# tar -zxf oozie-examples.tar.gz
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#

3>.Write the scripts

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# cat yinzhengjie-oozie-jobs/shell/test-.sh
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

/bin/date -d today +"%Y-%m-%d %T" > /home/yinzhengjie/data/access-.log
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# cat yinzhengjie-oozie-jobs/shell/test-.sh
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

/bin/date -d today +"%Y-%m-%d %T" > /home/yinzhengjie/data/access-.log
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#

4>.Edit the job.properties configuration file

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# more yinzhengjie-oozie-jobs/shell/job.properties
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# HDFS address
nameNode=hdfs://s101:8020
# ResourceManager address
jobTracker=s101:
# Queue name
queueName=default
examplesRoot=yinzhengjie-oozie-jobs
# Path where the Oozie shell workflow and its scripts are stored
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/shell
# Names of the scripts to run
EXEC1=test-.sh
EXEC2=test-.sh
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
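
One thing to watch in job.properties: Java properties files only treat '#' as a comment marker at the start of a line, so a '#' that follows a value on the same line becomes part of that value. The sketch below is one possible pre-submit check, not part of Oozie. The function name check_job_properties is hypothetical, and the key list is simply the set of keys this post defines in job.properties.

```shell
#!/bin/bash
# Hypothetical helper (not an Oozie command): sanity-check job.properties
# before submitting the workflow.
check_job_properties() {
  local file="$1" key status=0

  # Keys defined in the job.properties shown in this post.
  for key in nameNode jobTracker queueName oozie.wf.application.path EXEC1 EXEC2; do
    if ! grep -q "^${key}=" "$file"; then
      echo "missing key: ${key}"
      status=1
    fi
  done

  # Flag key=value lines that carry a trailing " #..." fragment, because
  # the properties parser would keep that text as part of the value.
  if grep -n '^[^#!][^=]*=.*[[:space:]]#' "$file"; then
    echo "inline '#' found on the lines above; move comments to their own lines"
    status=1
  fi

  return "$status"
}

# Usage sketch (path taken from this post):
#   check_job_properties yinzhengjie-oozie-jobs/shell/job.properties
```

The function returns 0 when every key is present and no inline comment is found, so it can gate the submit command in a wrapper script.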

5>.Edit the workflow.xml configuration file

 [root@yinzhengjie oozie-4.0.0-cdh5.3.6]# cat yinzhengjie-oozie-jobs/shell/workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="yinzhengjie-shell-node1"/>
    <action name="yinzhengjie-shell-node1">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC1}</exec>
            <file>/user/root/yinzhengjie-oozie-jobs/shell/${EXEC1}#${EXEC1}</file>
            <!-- <argument>my_output=Hello Oozie</argument>-->
            <capture-output/>
        </shell>
        <ok to="yinzhengjie-shell-node2"/>
        <error to="fail"/>
    </action>
    <action name="yinzhengjie-shell-node2">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC2}</exec>
            <file>/user/root/yinzhengjie-oozie-jobs/shell/${EXEC2}#${EXEC2}</file>
            <!-- <argument>my_output=Hello Oozie</argument>-->
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
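
The chaining in this workflow comes entirely from the transitions: start points at the first shell action, its ok transition points at the second action, and every error transition goes to the kill node. To double-check that wiring before uploading, something like the following sketch can print the transitions. workflow_transitions is a hypothetical helper written for this post, not an Oozie command. It reads the file line by line with awk rather than parsing XML, so it assumes each start, action, ok and error tag sits on a line with its attributes, as in the listing above.

```shell
#!/bin/bash
# Hypothetical helper (not an Oozie command): list the control flow of a
# workflow.xml by scanning it line by line. This is a sketch, not an XML parser.
workflow_transitions() {
  awk '
    # Return the value of attribute attr (for example to or name) on this line.
    function attr_value(attr,    pat) {
      pat = attr "=\"[^\"]+\""
      if (match($0, pat))
        return substr($0, RSTART + length(attr) + 2, RLENGTH - length(attr) - 3)
      return ""
    }
    /<start /  { print "start -> " attr_value("to") }
    /<action / { node = attr_value("name") }
    /<ok /     { ok = attr_value("to") }
    /<error /  { print node " -> ok:" ok " error:" attr_value("to") }
  ' "$1"
}

# Usage sketch (path taken from this post):
#   workflow_transitions yinzhengjie-oozie-jobs/shell/workflow.xml
```

For the workflow.xml above this should print three lines: start -> yinzhengjie-shell-node1, then yinzhengjie-shell-node1 -> ok:yinzhengjie-shell-node2 error:fail, then yinzhengjie-shell-node2 -> ok:end error:fail.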

6>.Upload the job configuration to HDFS

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put yinzhengjie-oozie-jobs/shell/ /user/root/yinzhengjie-oozie-jobs/shell
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -ls -R /user/root/yinzhengjie-oozie-jobs/shell
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/blog.sh
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/job.properties
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/test-.sh
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/test-.sh
-rw-r--r-- root supergroup -- : /user/root/yinzhengjie-oozie-jobs/shell/workflow.xml
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# /home/yinzhengjie/download/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put yinzhengjie-oozie-jobs/shell/ /user/root/yinzhengjie-oozie-jobs/shell

7>.Run the job

[root@yinzhengjie oozie-4.0.0-cdh5.3.6]# bin/oozie job -oozie http://s101:11000/oozie -config yinzhengjie-oozie-jobs/shell/job.properties -run
job: --oozie-root-W
[root@yinzhengjie oozie-4.0.0-cdh5.3.6]#
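
The submit command prints a single "job: <id>" line, and that ID is what the Oozie CLI's -info and -log options take when checking whether both shell actions ran. A minimal sketch for capturing it follows. extract_job_id is a hypothetical helper, and the commented usage assumes the same server URL and paths as above.

```shell
#!/bin/bash
# Hypothetical helper: read the output of "oozie job ... -run" on stdin and
# print only the job ID from the "job: <id>" line.
extract_job_id() {
  sed -n 's/^job:[[:space:]]*//p'
}

# Usage sketch (needs a running Oozie server, so it is left commented out):
#   job_id=$(bin/oozie job -oozie http://s101:11000/oozie \
#              -config yinzhengjie-oozie-jobs/shell/job.properties -run | extract_job_id)
#   bin/oozie job -oozie http://s101:11000/oozie -info "$job_id"   # status of the workflow and each action
#   bin/oozie job -oozie http://s101:11000/oozie -log "$job_id"    # server-side log for the job
```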
