原文链接:http://blog.madhukaraphatak.com/secondary-namenode---what-it-really-do/

Secondary Namenode is one of the poorly named component in Hadoop. By its name, it gives a sense that its a backup for the Namenode.But in reality its not. Lot of beginners in Hadoop get confused about what exactly SecondaryNamenode does and why its present in HDFS.So in this blog post I try to explain the role of secondary namenode in HDFS.

By its name, you may assume that it has something to do with Namenode and you are right. So before we dig into Secondary Namenode lets see what exactly Namenode does.

Namenode

Namenode holds the meta data for the HDFS like Namespace information, block information etc. When in use, all this information is stored in main memory. But these information also stored in disk for persistence storage.

The above image shows how Name Node stores information in disk.
Two different files are

  1. fsimage - Its the snapshot of the filesystem when namenode started
  2. Edit logs - Its the sequence of changes made to the filesystem after namenode started

Only in the restart of namenode , edit logs are applied to fsimage to get the latest snapshot of the file system. But namenode restart are rare in production clusters which means edit logs can grow very large for the clusters where namenode runs for a long period of time. The following issues we will encounter in this situation.

  1. Editlog become very large , which will be challenging to manage it
  2. Namenode restart takes long time because lot of changes has to be merged
  3. In the case of crash, we will lost huge amount of metadata since fsimage is very old

So to overcome this issues we need a mechanism which will help us reduce the edit log size which is manageable and have up to date fsimage ,so that load on namenode reduces . It’s very similar to Windows Restore point, which will allow us to take snapshot of the OS so that if something goes wrong , we can fallback to the last restore point.

So now we understood NameNode functionality and challenges to keep the meta data up to date.So what is this all have to with Seconadary Namenode?

Secondary Namenode

Secondary Namenode helps to overcome the above issues by taking over responsibility of merging editlogs with fsimage from the namenode.

The above figure shows the working of Secondary Namenode

  1. It gets the edit logs from the namenode in regular intervals and applies to fsimage
  2. Once it has new fsimage, it copies back to namenode
  3. Namenode will use this fsimage for the next restart,which will reduce the startup time

Secondary Namenode whole purpose is to have a checkpoint in HDFS. Its just a helper node for namenode.That’s why it also known as checkpoint node inside the community.

So we now understood all Secondary Namenode does puts a checkpoint in filesystem which will help Namenode to function better. Its not the replacement or backup for the Namenode. So from now on make a habit of calling it as a checkpoint node.

Secondary Namenode - What it really do?的更多相关文章

  1. Secondary NameNode:的作用?

    前言 最近刚接触Hadoop, 一直没有弄明白NameNode和Secondary NameNode的区别和关系.很多人都认为,Secondary NameNode是NameNode的备份,是为了防止 ...

  2. Hadoop之Secondary NameNode

    NameNode存储文件系统的变化作为log追加在本地的一个文件里:这个文件是edits.当一个NameNode启动时,它从一个映像文件:FsImage,读取HDFS的状态,使用来自edits日志文件 ...

  3. 解读Secondary NameNode的功能

    1.概述 最近有朋友问我Secondary NameNode的作用,是不是NameNode的备份?是不是为了防止NameNode的单点问题?确实,刚接触Hadoop,从字面上看,很容易会把Second ...

  4. Secondary NameNode 的作用

    https://blog.csdn.net/xh16319/article/details/31375197 很多人都认为,Secondary NameNode是NameNode的备份,是为了防止Na ...

  5. (转)Secondary NameNode的作用

    在Hadoop中,有一些命名不好的模块,Secondary NameNode是其中之一.从它的名字上看,它给人的感觉就像是NameNode的备份.但它实际上却不是.很多Hadoop的初学者都很疑惑,S ...

  6. 010 secondary namenode(同步元数据和日志)

    1.格式化 首先格式化之后只剩下一个根目录. 格式化后会出现元数据 集群启动之后,元数据放在内存中的(消耗内存中) 格式化后会产生镜像文件fsimage,元数据存储 启动的时候namenode会读取镜 ...

  7. Secondary NameNode究竟是做什么的

    Secondary NameNode:它究竟有什么作用? 在hadoop中,有一些命名不好的模块,Secondary NameNode是其中之一.从它的名字上看,它给人的感觉就像是NameNode的备 ...

  8. Hadoop- NameNode和Secondary NameNode元数据管理机制

    元数据的存储机制 A.内存中有一份完整的元数据(内存meta data) B.磁盘有一个“准完整”的元数据镜像(fsimage)文件(在namenode的工作目录中) C.用于衔接内存metadata ...

  9. NameNode && Secondary NameNode工作机制

    NameNode && Secondary NameNode工作机制 1)工作流程 2)  fsimage和edits NameNode是HDFS的大脑,它维护着整个文件系统的目录树, ...

随机推荐

  1. 手机新闻网站,掌上移动新闻,手机报client,jQuery Mobile手机新闻网站,手机新闻网站demo,新闻阅读器开发

    我们坐在地铁,经常来查看新浪手机新闻,腾讯新闻.或者刷微信看新闻更多功能.你有没有想过如何实现这些目标.移动互联网,更活泼. 因为HTML5到,jQuery Moblie到.今天我用jqm为了给你一个 ...

  2. TodoList开发笔记 – Part Ⅲ

    本节开始对TodoList项目的客户端进行开发 一.初步了解JQuery 其实我在学校时有接触过一段时间的Web开发,虽然代码量不多也不复杂,但也已经感受到了各浏览器对Web各项标准的恶意,Web界对 ...

  3. SpringMVC全注解

    SpringMVC全注解不是你们那么玩的 前言:忙了段时间,忙得要死要活,累了一段时间,累得死去活来. 偶尔看到很多零注解配置SpringMVC,其实没有根本的零注解. 1)工程图一张: web.xm ...

  4. QMVC

    高性能.NET MVC之QMVC! ASP.NET!这个词代表者一个单词Fat!因为他总是捆绑着太多的太多的类,太多太多的各种功能!你也许会用到,如果你反编译或阅读他们开源的源码,你会不会犹如在大海中 ...

  5. ASP.NET SignalR 2.0入门指南

    ASP.NET SignalR 2.0入门指南 介绍SignalR ASP.NET SignalR 是一个为 ASP.NET 开发人员的库,简化了将实时 web 功能添加到应用程序的过程.实时Web功 ...

  6. Kindergarten Counting Game - UVa494

    欢迎访问我的新博客:http://www.milkcu.com/blog/ 原文地址:http://www.milkcu.com/blog/archives/uva494.html 题目描述  Kin ...

  7. rabbitmq-message(C#)

    1.安装Erlang Windows Binary File 2.安装rabbitmq-server(windows)rabbitmq-server-3.5.4.exe 参考:http://www.r ...

  8. poj3083走玉米地问题

    走玉米地迷宫,一般有两种简单策略,遇到岔路总是优先沿着自己的左手方向,或者右手方向走.给一个迷宫,给出这两种策略的步数,再给出最短路径的长度. ######### #.#.#.#.# S....... ...

  9. MVC4 + WebAPI + EasyUI + Knockout-授权代码维护

    我的权限系统设计实现MVC4 + WebAPI + EasyUI + Knockout(四)授权代码维护 一.前言 权限系统设计中,授权代码是用来控制数据访问权限的.授权代码说白了只是一树型结构的数据 ...

  10. ${pageContext.request.contextPath}的作用

    刚开始不知道是怎么回事,在网上也查找了一些资料,看了还是晕. 看了另一个大侠的,终于有了点眉目. 那位大侠在博客中这样写道“然后在网上找,更让我郁闷的事,TMD!网上“抄袭”的真多啊!而且扯了一大堆! ...