hadoop中日志聚集问题

遇到的问题：

当点击上面的logs时，会出现下面问题：

这个解决方案为：

By default, Hadoop stores the logs of each container in the node where that container was hosted. While this is irrelevant if you're just testing some Hadoop executions in a single-node environment (as all the logs will be in your machine anyway), with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the normal filesystem, you may run into storage problems if you keep logs for a long time or have heterogeneous storage capabilities.

Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xmland restart the Hadoop services:

 <property>

    <description>Whether to enable log aggregation</description>

    <name>yarn.log-aggregation-enable</name>

    <value>true</value>

  </property>

By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path and other options related to log aggregation by specifying some other properties mentioned in the default yarn-site.xml (just do a search for log.aggregation).

However, these aggregated logs are not stored in a human readable format so you can't just cat their contents. Fortunately, Hadoop developers have included several handy command line tools for reading them:

# Read logs from any YARN application

$HADOOP_HOME/bin/yarn logs -applicationId <applicationId>

# Read logs from MapReduce jobs

$HADOOP_HOME/bin/mapred job -logs <jobId>

# Read it in a scrollable window with search (type '/' followed by your query).

$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less

# Or just save it to a file and use your favourite editor

$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt

You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:

# Start JobHistory daemon

$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver

# Stop JobHistory daemon

$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver

My Fabric script includes an optional variable for setting the node where to launch this daemon so it is automatically started/stopped when you run fab start or fab stop.

Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.

hadoop中日志聚集问题的更多相关文章

Hadoop基础-完全分布式模式部署yarn日志聚集功能
Hadoop基础-完全分布式模式部署yarn日志聚集功能作者:尹正杰版权声明:原创作品,谢绝转载!否则将追究法律责任. 其实我们不用配置也可以在服务器后台通过命令行的形式查看相应的日志,但为了更方 ...
hadoop配置历史服务器&&配置日志聚集
配置历史服务器 1.在mapred-site.xml中写入一下配置 <property> <name>mapreduce.jobhistory.address</name ...
hadoop 3.x 配置日志聚集功能
打开$HADOOP_HOME/etc/hadoop/yarn-site.xml,增加以下配置(在此配置文件中尽量不要使用中文注释)  <property> ...
开启spark日志聚集功能
spark监控应用方式: 1)在运行过程中可以通过web Ui:4040端口进行监控 2)任务运行完成想要监控spark,需要启动日志聚集功能开启日志聚集功能方法: 编辑conf/spark-env ...
Yarn 的日志聚集功能配置使用
需要 hadoop 的安装目录/etc/hadoop/yarn-site.xml 中进行配置配置内容 <property> <name>yarn.log-aggregati ...
5，Hadoop中的文件
1,文件结构 · bin:脚本和命令目录. · etc:配置文件目录. · sbin:命令目录,主要包含HDFS和YARN中各类服务的启动和关闭,依赖于bin中的脚本. · share:各个模块编译后 ...
再谈SQL Server中日志的的作用
简介之前我已经写了一个关于SQL Server日志的简单系列文章.本篇文章会进一步挖掘日志背后的一些概念,原理以及作用.如果您没有看过我之前的文章,请参阅: 浅谈SQL Server ...
Hive分析hadoop进程日志
想把hadoop的进程日志导入hive表进行分析,遂做了以下的尝试. 关于hadoop进程日志的解析使用正则表达式获取四个字段,一个是日期时间,一个是日志级别,一个是类,最后一个是详细信息, 然后在 ...
hadoop中常见元素的解释
secondarynamenode 图: secondarynamenode根据文件的的大小对namenode的编辑日志和镜像日志进行合并. 光从字面上来理解,很容易让一些初学者先入为主的认为:Se ...

随机推荐

【hadoop2.6.0】一句话形容mapreduce
网上看到的: We want to count all the books in the library. You count up shelf #1, I count up shelf #2. Th ...
移动开发之meta篇
<meta name="viewport" content="width=device-width, initial-scale=1, user-scalable= ...
linux入门教程(十) 文档的压缩与打包
在windows下我们接触最多的压缩文件就是.rar格式的了.但在linux下这样的格式是不能识别的,它有自己所特有的压缩工具.但有一种文件在windows和linux下都能使用那就是.zip格式的文 ...
lintcode：previous permutation上一个排列
题目上一个排列给定一个整数数组来表示排列,找出其上一个排列. 样例给出排列[1,3,2,3],其上一个排列是[1,2,3,3] 给出排列[1,2,3,4],其上一个排列是[4,3,2,1] 注意 ...
*[topcoder]TheMatrix
http://community.topcoder.com/stat?c=problem_statement&pm=13035 矩阵棋盘的题,比较典型.可以定两条column夹住,然后上下扫, ...
iOS开发--多线程
前面在<Bison眼中的iOS开发多线程是这样的(二)>一文中讲完了多线程的NSThread,不难发现这种方式的多线程实现起来非常的复杂,为了简化多线程的开发,iOS提供了GCD来实现多线 ...
如何发布使用LGPL版Qt的商业软件
最近做跨平台图形用户界面库选型,权衡很多因素后最终选择了Qt,其中一个重要因素就是Qt使用LGPL授权许可.由于本人对LGPL理解有限,始终对闭源商业软件如何发布Qt库存在疑问,其中最关心的是:发布的 ...
oracle 修改表的sql语句
oracle 修改表的sql语句 1增加一个列:ALTER TABLE 表名 ADD(列名数据类型);如:ALTER TABLE emp ADD(license varchar2(256)) ...
231. Power of Two
题目: Given an integer, write a function to determine if it is a power of two. 链接: http://leetcode.com ...
Filter登录验证过滤器（全局）
通过Filter来定义一个登录验证过滤器,这是就不需要在每一个JSP页面添加判断用户合法性的代码了. 以下示例中包含了5个文件,一个是登录表单LoginForm.jsp,一个是登录判断页LoginCo ...

hadoop中日志聚集问题

hadoop中日志聚集问题的更多相关文章

随机推荐

热门专题