NameNode & DataNode

NameNode类位于org.apache.hadoop.hdfs.server.namenode包下。

NameNode serves as both directory namespace manager and "inode table" for the Hadoop DFS. There is a single NameNode running in any DFS deployment. (Well, except when there is a second backup/failover NameNode.)

The NameNode controls two critical tables:
1) filename->blocksequence (namespace)
2) block->machinelist ("inodes")

The first table is stored on disk and is very precious. The second table is rebuilt every time the NameNode comes up.

'NameNode' refers to both this class as well as the 'NameNode server'. The 'FSNamesystem' class actually performs most of the filesystem management. The majority of the 'NameNode' class itself is concerned with exposing the IPC interface and the http server to the outside world, plus some configuration management.

NameNode implements the ClientProtocol interface, which allows clients to ask for DFS services. ClientProtocol is not designed for direct use by authors of DFS client code. End-users should instead use the org.apache.nutch.hadoop.fs.FileSystem class.

NameNode also implements the DatanodeProtocol interface, used by DataNode programs that actually store DFS data blocks. These methods are invoked repeatedly and automatically by all the DataNodes in a DFS deployment.

NameNode also implements the NamenodeProtocol interface, used by secondary namenodes or rebalancing processes to get partial namenode's state, for example partial blocksMap etc.

DataNode 类位于org.apache.hadoop.hdfs.server.datanode包下。
DataNode is a class (and program) that stores a set of blocks for a DFS deployment. A single deployment can have one or many DataNodes. Each DataNode communicates regularly with a single NameNode. It also communicates with client code and other DataNodes from time to time.

DataNodes store a series of named blocks. The DataNode allows client code to read these blocks, or to write new block data. The DataNode may also, in response to instructions from its NameNode, delete blocks or copy blocks to/from other DataNodes.

The DataNode maintains just one critical table:
block-> stream of bytes (of BLOCK_SIZE or less)

This info is stored on a local disk. The DataNode reports the table's contents to the NameNode upon startup and every so often afterwards.

DataNodes spend their lives in an endless loop of asking the NameNode for something to do. A NameNode cannot connect to a DataNode directly; a NameNode simply returns values from functions invoked by a DataNode.

DataNodes maintain an open server socket so that client code or other DataNodes can read/write data. The host/port for this server is reported to the NameNode, which then sends that information to clients or other DataNodes that might be interested.

查找工程里的类或者是资源文件：Ctrl + Shift + R。
查找jar包里的类：Ctrl + Shift + T。

NameNode & DataNode的更多相关文章

HDFS Namenode&Datanode
HDFS Namenode&Datanode HDFS 机制粗略示意图客户端写入文件流程: NN && DN Namenode(NN)工作机制 NN是整个文件系统的管理节点. ...
hadoop stop-dfs.sh 无法停止 namenode datanode
原因: HADOOP_PID_DIR 默认为 /tmp 目录,如果长期不访问/tmp/目录下的文件,文件会被自动清理,因此 stop-dfs.sh 无法根据 pid 停止 namenode, data ...
Hadoop学习笔记（老版本，YARN之前），MapReduce任务Namenode DataNode Jobtracker Tasktracker之间的关系
一.基本概念在MapReduce中,一个准备提交执行的应用程序称为“作业(job)”,而从一个作业划分出的运行于各个计算节点的工作单元称为“任务(task)”.此外,Hadoop提供的分布式文件系统 ...
hdfs namenode/datanode工作机制
一. namenode工作机制 1. 客户端上传文件时,namenode先检查有没有同名的文件,如果有,则直接返回错误信息.如果没有,则根据要上传文件的大小以及block的大小,算出需要分成几个blo ...
namenode datanode理解
HDFS是以NameNode和DataNode管理者和工作者模式运行的. NameNode管理着整个HDFS文件系统的元数据.从架构设计上看,元数据大致分成两个层次:Name ...
【Hadoop】hdfs的秘密，namenode,datanode,yarn,安全模式，fsimage,edits...
1.bin/hdfs namenode -format ** 注意事项 1.在配置好了配置文件之后,首次启动之前,做初始化操作 2.在后续启动的时候,不需要再初始化 3.初始化的一些影响一.初始化操 ...
[Hadoop异常处理] Namenode和Datanode都正常启动,但是web页面不显示
异常 namenode和data都正常启动但是web页面却不显示,都为零解决办法一: 在hdfs-site.xml配置文件中,加入 <property> <name>dfs ...
HDFS体系结构(NameNode、DataNode详解)
hadoop项目地址:http://hadoop.apache.org/ NameNode.DataNode详解 (一)分布式文件系统概述数据量越来越多,在一个操作系统管辖的范围存不下了,那么就分配 ...
datanode与namenode的通信
在分析DataNode时, 因为DataNode上保存的是数据块, 因此DataNode主要是对数据块进行操作. A. DataNode的主要工作流程1. 客户端和DataNode的通信: 客户端向D ...

随机推荐

通过ssh tunnel连接内网ECS和RDS
通过ssh tunnel连接内网ECS和RDS 这里讲了ssh tunnel的原理.很清晰. 此后又给外网访问内网增加了一种思路.感觉特别棒. 拓宽了思路:
将packages/apps/下的app导入eclipse
当刚接触android自带的一个模块时,如何去熟悉它?相信不少人第一步都会尝试着去了解其内容的调用流程,而此时若能够单步调试则显得非常重要了,于是有了文章标题所说的尝试. 作者这里要导入的是Setti ...
tomcat 新手上路
前提:本机先安装好JDK,保证常规java环境已经具备 1.下载Tomcat 7.0现在官网上好象已经没有安装程序版了,只有免解压zip版本(现在最新的版本是7.0.42) 下载地址 http://t ...
发布了Android的App，我要开源几个组件！
做了一款App,本来是毕业设计但是毕业的时候还没有做完,因为大部分时间都改论文去了,你们都懂的.现在毕业了在工作之余把App基本上做完了.为什么说基本上呢,因为我觉得还有很多功能还没实现,还要很多bu ...
Linux下who命令之C语言实现
Linux下who命令之C语言实现 Step1:前期准备首先要有一个清楚的认识:linux中一切皆文件实现who命令,who命令也是Linux中的一个文件,那我们怎么找到它呢?我们可以" ...
sql 几点记录
1 With子句 1.1 学习目标掌握with子句用法,并且了解with子句能够提高查询效率的原因. 1.2 With子句要点 with子句的返回结果存到用户的临时表 ...
Mecanim动画模型规范
面数控制, 以三角面计算不要超过4边的面光滑组,法线单位CM,单位比例中心点 3DMax:Reset Transform Maya:Freeze Transformation 帧率:30帧不 ...
js如何判断一个数组
typeof [] 为一个"object" 不能通过此方法判断一个数组方法 1.instanceof方法,这个方法用的比较多. 2.这个是es5以后推荐的方法,Object.pr ...
matlab 画图中线型及颜色设置
matlab受到控制界广泛接受的一个重要原因是因为它提供了方便的绘图功能.本章主要介绍2维图形对象的生成函数及图形控制函数的使用方法,还将简单地介绍一些图形的修饰与标注函数及操作和控制MATLA ...
利用ZTree链接数据库实现 [权限管理]
最近想研究权限管理,看群里有人发了ZTrees模板,我看了下,觉得笔easyUI操作起来更灵活些,于是就开始研究了. 刚开始从网上找了找了个Demo,当然这个并没有实现权限啥的,但实现了前台调用Aja ...

NameNode & DataNode

NameNode & DataNode的更多相关文章

随机推荐

热门专题