遇到的问题:

当点击上面的logs时,会出现下面问题:

这个解决方案为:

By default, Hadoop stores the logs of each container in the node where that container was hosted. While this is irrelevant if you're just testing some Hadoop executions in a single-node environment (as all the logs will be in your machine anyway), with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the normal filesystem, you may run into storage problems if you keep logs for a long time or have heterogeneous storage capabilities.

Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xmland restart the Hadoop services:

 <property>
<description>Whether to enable log aggregation</description>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>

By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path and other options related to log aggregation by specifying some other properties mentioned in the default yarn-site.xml (just do a search for log.aggregation).

However, these aggregated logs are not stored in a human readable format so you can't just cat their contents. Fortunately, Hadoop developers have included several handy command line tools for reading them:

# Read logs from any YARN application
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> # Read logs from MapReduce jobs
$HADOOP_HOME/bin/mapred job -logs <jobId> # Read it in a scrollable window with search (type '/' followed by your query).
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less # Or just save it to a file and use your favourite editor
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt

You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:

# Start JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver
# Stop JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver

My Fabric script includes an optional variable for setting the node where to launch this daemon so it is automatically started/stopped when you run fab start or fab stop.

Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.

hadoop中日志聚集问题的更多相关文章

  1. Hadoop基础-完全分布式模式部署yarn日志聚集功能

    Hadoop基础-完全分布式模式部署yarn日志聚集功能 作者:尹正杰 版权声明:原创作品,谢绝转载!否则将追究法律责任. 其实我们不用配置也可以在服务器后台通过命令行的形式查看相应的日志,但为了更方 ...

  2. hadoop配置历史服务器&&配置日志聚集

    配置历史服务器 1.在mapred-site.xml中写入一下配置 <property> <name>mapreduce.jobhistory.address</name ...

  3. hadoop 3.x 配置日志聚集功能

    打开$HADOOP_HOME/etc/hadoop/yarn-site.xml,增加以下配置(在此配置文件中尽量不要使用中文注释) <!--logs--> <property> ...

  4. 开启spark日志聚集功能

    spark监控应用方式: 1)在运行过程中可以通过web Ui:4040端口进行监控 2)任务运行完成想要监控spark,需要启动日志聚集功能 开启日志聚集功能方法: 编辑conf/spark-env ...

  5. Yarn 的日志聚集功能配置使用

    需要  hadoop 的安装目录/etc/hadoop/yarn-site.xml 中进行配置 配置内容 <property> <name>yarn.log-aggregati ...

  6. 5,Hadoop中的文件

    1,文件结构 · bin:脚本和命令目录. · etc:配置文件目录. · sbin:命令目录,主要包含HDFS和YARN中各类服务的启动和关闭,依赖于bin中的脚本. · share:各个模块编译后 ...

  7. 再谈SQL Server中日志的的作用

    简介     之前我已经写了一个关于SQL Server日志的简单系列文章.本篇文章会进一步挖掘日志背后的一些概念,原理以及作用.如果您没有看过我之前的文章,请参阅:     浅谈SQL Server ...

  8. Hive分析hadoop进程日志

    想把hadoop的进程日志导入hive表进行分析,遂做了以下的尝试. 关于hadoop进程日志的解析 使用正则表达式获取四个字段,一个是日期时间,一个是日志级别,一个是类,最后一个是详细信息, 然后在 ...

  9. hadoop中常见元素的解释

    secondarynamenode 图: secondarynamenode根据文件的的大小对namenode的编辑日志和镜像日志 进行合并. 光从字面上来理解,很容易让一些初学者先入为主的认为:Se ...

随机推荐

  1. [转] 浅析HTTP协议

    浅析HTTP协议 来源:http://www.cnblogs.com/gpcuster/archive/2009/05/25/1488749.html HTTP协议是什么? 简单来说,就是一个基于应用 ...

  2. struts2学习笔记(2)——简单的输入验证以及标签库的运用

    struts2自带了一些标签库,运用好这些标签库会减少开发周期. 1.struts2标签库是在哪定义的? struts2标签库定义在struts2-core-2.1.8.jar这个文件中,具体在这个j ...

  3. Libsvm学习

        本篇博客转自 http://www.cppblog.com/guijie/archive/2013/09/05/169034.html     在电脑文件夹E:\other\matlab 20 ...

  4. 对cost函数的概率解释

    Likehood函数即似然函数,是概率统计中经常用到的一种函数,其原理网上很容易找到,这里就不讲了.这篇博文主要讲解Likelihood对回归模型的Probabilistic interpretati ...

  5. [hankerrank]Counter game

    https://www.hackerrank.com/contests/w8/challenges/counter-game 关键是二分查找等于或最接近的小于target的数.可以使用和mid+1判断 ...

  6. React-用Jest测试

    一. 1.目录结构 二.代码 1.CheckboxWithLabel.jsx var React = require('react/addons'); var CheckboxWithLabel = ...

  7. python 下划线的使用(转载:安生犹梦 新浪博客)

    Python 用下划线作为变量前缀和后缀指定特殊变量. _xxx      不能用'from module import *'导入 __xxx__ 系统定义名字 __xxx    类中的私有变量名 核 ...

  8. *CentOS下简单的MySQL数据库操作

    1.登录成功之后退出的话,直接输入quit或者exit即可.

  9. CentOS环境下Java开发环境的搭建

    ------------------------------------------------------- 安装Jdk 1.查询系统默认JDK CentOS系统默认会安装JDK,一般建议卸载后安装 ...

  10. class卸载、热替换和Tomcat的热部署的分析

    一 class的热替换 ClassLoader中重要的方法 loadClassClassLoader.loadClass(...) 是ClassLoader的入口点.当一个类没有指明用什么加载器加载的 ...