单机/伪分布式Hadoop2.4.1安装文档 2014-07-08 21:16 2275人阅读 评论(0) 收藏
转载自官方文档,最新版请见:http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SingleCluster.html
补充:建议添加如下环境变量
#hadoop configuration
export PATH=$PATH:/home/jediael/hadoop-2.4.1/bin:/home/jediael/hadoop-2.4.1/sbin
export HADOOP_HOME=/home/jediael/hadoop-2.4.1
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
Purpose
This document describes how to set up and configure a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
Prerequisites
Supported Platforms
- GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
- Windows is also a supported platform but the followings steps are for Linux only. To set up Hadoop on Windows, see wiki page.
Required Software
Required software for Linux include:
- Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
- ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
Installing Software
If your cluster doesn't have the requisite software you will need to install it.
For example on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
Download
To get a Hadoop distribution, download a recent stable release from one of the Apache Download Mirrors.
Prepare to Start the Hadoop Cluster
Unpack the downloaded Hadoop distribution. In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest # Assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
第二步不做好像没影响。
Try the following command:
$ bin/hadoop
This will display the usage documentation for the hadoop script.
Now you are ready to start your Hadoop cluster in one of the three supported modes:
Standalone Operation
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
Pseudo-Distributed Operation
Hadoop can also be run on a single-node in a pseudo-distributed mode where each Hadoop daemon runs in a separate Java process.
Configuration
Use the following:
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Setup passphraseless ssh
Now check that you can ssh to the localhost without a passphrase:
$ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Execution
The following instructions are to run a MapReduce job locally. If you want to execute a job on YARN, see YARN on Single Node.
- Format the filesystem:
$ bin/hdfs namenode -format
- Start NameNode daemon and DataNode daemon:此步骤报很多警告,但不影响执行结果。
$ sbin/start-dfs.sh
The hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
- Browse the web interface for the NameNode; by default it is available at:
- NameNode - http://localhost:50070/
- Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username> - Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input
- Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
- Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hdfs dfs -get output output
$ cat output/*or
View the output files on the distributed filesystem:
$ bin/hdfs dfs -cat output/*
- When you're done, stop the daemons with:
$ sbin/stop-dfs.sh
YARN on Single Node
You can run a MapReduce job on YARN in a pseudo-distributed mode by setting a few parameters and running ResourceManager daemon and NodeManager daemon in addition.
The following instructions assume that 1. ~ 4. steps of the above instructions are already executed.
- Configure parameters as follows:
etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration> - Start ResourceManager daemon and NodeManager daemon:
$ sbin/start-yarn.sh
- Browse the web interface for the ResourceManager; by default it is available at:
- ResourceManager - http://localhost:8088/
- Run a MapReduce job.
- When you're done, stop the daemons with:
$ sbin/stop-yarn.sh
单机/伪分布式Hadoop2.4.1安装文档 2014-07-08 21:16 2275人阅读 评论(0) 收藏的更多相关文章
- 单机/伪分布式Hadoop2.4.1安装文档
转载自官方文档,最新版请见:http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SingleCluster.h ...
- opnet安装及安装中出现问题的解决办法 分类: opnet 2014-04-06 21:50 397人阅读 评论(0) 收藏
我使用的opnet14.5 win7 64位系统的http://pan.baidu.com/s/1qWyfxnu,电脑先刷了win7 64位原版系统. 选择了VS2013+opnet14.5的安装方 ...
- CocoaPods安装和使用教程 分类: ios技术 ios相关 2015-03-11 21:53 48人阅读 评论(0) 收藏
目录 CocoaPods是什么? 如何下载和安装CocoaPods? 如何使用CocoaPods? 场景1:利用CocoaPods,在项目中导入AFNetworking类库 场景2:如何正确编译运行一 ...
- VS2010下安装Opencv 分类: Opencv 2014-11-02 13:51 778人阅读 评论(0) 收藏
Opencv作为一种跨平台计算机视觉库,在图像处理领域得到广泛的应用,下面介绍如何在VS2010中安装配置Opencv 一.下载Opencv 下载网址:http://sourceforge.net/p ...
- JAVA代码规范 标签: java文档工作 2016-06-12 21:50 277人阅读 评论(5) 收藏
开始做java的ITOO了,近期的工作内容就是按照代码规范来改自己负责的代码,之前做机房收费系统的时候,也是经常验收的,甚至于我们上次验收的时候,老师也去了.对于我们的代码规范,老师其实是很重视的,他 ...
- 安装hadoop2.6.0伪分布式环境 分类: A1_HADOOP 2015-04-27 18:59 409人阅读 评论(0) 收藏
集群环境搭建请见:http://blog.csdn.net/jediael_lu/article/details/45145767 一.环境准备 1.安装linux.jdk 2.下载hadoop2.6 ...
- 安装spark1.3.1单机环境 分类: B8_SPARK 2015-04-27 14:52 1873人阅读 评论(0) 收藏
本文介绍安装spark单机环境的方法,可用于测试及开发.主要分成以下4部分: (1)环境准备 (2)安装scala (3)安装spark (4)验证安装情况 1.环境准备 (1)配套软件版本要求:Sp ...
- 搭建hadoop2.6.0集群环境 分类: A1_HADOOP 2015-04-20 07:21 459人阅读 评论(0) 收藏
一.规划 (一)硬件资源 10.171.29.191 master 10.171.94.155 slave1 10.251.0.197 slave3 (二)基本资料 用户: jediael 目录: ...
- Hadoop1.2.1伪分布模式安装指南 分类: A1_HADOOP 2014-08-17 10:52 1346人阅读 评论(0) 收藏
一.前置条件 1.操作系统准备 (1)Linux可以用作开发平台及产品平台. (2)win32只可用作开发平台,且需要cygwin的支持. 2.安装jdk 1.6或以上 3.安装ssh,并配置免密码登 ...
随机推荐
- 在web开发中你不得不注意的安全验证问题#2-XSS
前言 XSS又叫CSS (Cross Site Script) ,跨站脚本攻击. 恶意攻击者往Web页面里插入恶意html代码.当用户浏览该页之时,嵌入当中Web里面的html代码会被运行,从而达到恶 ...
- 数据结构(C实现)------- 单链表
在单链表中,每个结点包括两部分:存放每个数据元素本身信息的数据域和存放其直接后继存储位置的指针域. 单链表结点的类型描写叙述: typedef int ElemType; typedef struct ...
- python使用大漠插件进行脚本开发的尝试(一)
关于游戏脚本是纯然的小白,记一下学习过程遇到的问题.是在win10系统下对PC端的游戏进行脚本编辑,不知道会不会半途放弃. 一.大漠插件 大漠插件在游戏脚本编辑过程中是比较常见的工具,按我理解大致做的 ...
- BZOJ2119: 股市的预测(后缀数组)
Description 墨墨的妈妈热爱炒股,她要求墨墨为她编写一个软件,预测某只股票未来的走势.股票折线图是研究股票的必备工 具,它通过一张时间与股票的价位的函数图像清晰地展示了股票的走势情况.经过长 ...
- C# Aspect-Oriented Programming(AOP) 利用多种模式实现动态代理
什么是AOP(Aspect-Oriented Programming)? AOP允许开发者动态地修改静态的OO模型,构造出一个能够不断增长以满足新增需求的系统,就象现实世界中的对象会在其生命周期中不断 ...
- restcontroller和controller区别
http://www.cnblogs.com/softidea/p/5884772.html#undefined http://blog.csdn.net/blueheart20/article/de ...
- 【SSH高速进阶】——struts2简单的实例
近期刚刚入门struts2.这里做一个简单的struts2实例来跟大家一起学习一下. 本例实现最简单的登陆,仅包括两个页面:login.jsp 用来输入username与password:succes ...
- 《Java实战开发经典》第五章5.3
package xiti5; public class Third { public static void main(String[] args) { T t=new T("want yo ...
- 解决Cookie乱码
在Asp.net的HttpCookie中写入汉字,读取值为什么全是乱码?其实这是因 为文字编码而造成的,汉字是两个编码,所以才会搞出这么个乱码出来!其实解决的方法很简单:只要在写入Cookie时,先将 ...
- TreeView 的简单实用
TreeView组件是由多个类来定义的,TreeView组件是由命名空间"System.Windows.Forms"中的"TreeView"类来定义的,而其中的 ...