单机/伪分布式Hadoop2.4.1安装文档 2014-07-08 21:16 2275人阅读 评论(0) 收藏
转载自官方文档,最新版请见:http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SingleCluster.html
补充:建议添加如下环境变量
#hadoop configuration
export PATH=$PATH:/home/jediael/hadoop-2.4.1/bin:/home/jediael/hadoop-2.4.1/sbin
export HADOOP_HOME=/home/jediael/hadoop-2.4.1
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.
Purpose
This document describes how to set up and configure a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
Prerequisites
Supported Platforms
- GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
- Windows is also a supported platform but the followings steps are for Linux only. To set up Hadoop on Windows, see wiki page.
Required Software
Required software for Linux include:
- Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
- ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
Installing Software
If your cluster doesn't have the requisite software you will need to install it.
For example on Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
Download
To get a Hadoop distribution, download a recent stable release from one of the Apache Download Mirrors.
Prepare to Start the Hadoop Cluster
Unpack the downloaded Hadoop distribution. In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest # Assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
第二步不做好像没影响。
Try the following command:
$ bin/hadoop
This will display the usage documentation for the hadoop script.
Now you are ready to start your Hadoop cluster in one of the three supported modes:
Standalone Operation
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
Pseudo-Distributed Operation
Hadoop can also be run on a single-node in a pseudo-distributed mode where each Hadoop daemon runs in a separate Java process.
Configuration
Use the following:
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Setup passphraseless ssh
Now check that you can ssh to the localhost without a passphrase:
$ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Execution
The following instructions are to run a MapReduce job locally. If you want to execute a job on YARN, see YARN on Single Node.
- Format the filesystem:
$ bin/hdfs namenode -format
- Start NameNode daemon and DataNode daemon:此步骤报很多警告,但不影响执行结果。
$ sbin/start-dfs.sh
The hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
- Browse the web interface for the NameNode; by default it is available at:
- NameNode - http://localhost:50070/
- Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username> - Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input
- Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
- Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hdfs dfs -get output output
$ cat output/*or
View the output files on the distributed filesystem:
$ bin/hdfs dfs -cat output/*
- When you're done, stop the daemons with:
$ sbin/stop-dfs.sh
YARN on Single Node
You can run a MapReduce job on YARN in a pseudo-distributed mode by setting a few parameters and running ResourceManager daemon and NodeManager daemon in addition.
The following instructions assume that 1. ~ 4. steps of the above instructions are already executed.
- Configure parameters as follows:
etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration> - Start ResourceManager daemon and NodeManager daemon:
$ sbin/start-yarn.sh
- Browse the web interface for the ResourceManager; by default it is available at:
- ResourceManager - http://localhost:8088/
- Run a MapReduce job.
- When you're done, stop the daemons with:
$ sbin/stop-yarn.sh
单机/伪分布式Hadoop2.4.1安装文档 2014-07-08 21:16 2275人阅读 评论(0) 收藏的更多相关文章
- 单机/伪分布式Hadoop2.4.1安装文档
转载自官方文档,最新版请见:http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/SingleCluster.h ...
- opnet安装及安装中出现问题的解决办法 分类: opnet 2014-04-06 21:50 397人阅读 评论(0) 收藏
我使用的opnet14.5 win7 64位系统的http://pan.baidu.com/s/1qWyfxnu,电脑先刷了win7 64位原版系统. 选择了VS2013+opnet14.5的安装方 ...
- CocoaPods安装和使用教程 分类: ios技术 ios相关 2015-03-11 21:53 48人阅读 评论(0) 收藏
目录 CocoaPods是什么? 如何下载和安装CocoaPods? 如何使用CocoaPods? 场景1:利用CocoaPods,在项目中导入AFNetworking类库 场景2:如何正确编译运行一 ...
- VS2010下安装Opencv 分类: Opencv 2014-11-02 13:51 778人阅读 评论(0) 收藏
Opencv作为一种跨平台计算机视觉库,在图像处理领域得到广泛的应用,下面介绍如何在VS2010中安装配置Opencv 一.下载Opencv 下载网址:http://sourceforge.net/p ...
- JAVA代码规范 标签: java文档工作 2016-06-12 21:50 277人阅读 评论(5) 收藏
开始做java的ITOO了,近期的工作内容就是按照代码规范来改自己负责的代码,之前做机房收费系统的时候,也是经常验收的,甚至于我们上次验收的时候,老师也去了.对于我们的代码规范,老师其实是很重视的,他 ...
- 安装hadoop2.6.0伪分布式环境 分类: A1_HADOOP 2015-04-27 18:59 409人阅读 评论(0) 收藏
集群环境搭建请见:http://blog.csdn.net/jediael_lu/article/details/45145767 一.环境准备 1.安装linux.jdk 2.下载hadoop2.6 ...
- 安装spark1.3.1单机环境 分类: B8_SPARK 2015-04-27 14:52 1873人阅读 评论(0) 收藏
本文介绍安装spark单机环境的方法,可用于测试及开发.主要分成以下4部分: (1)环境准备 (2)安装scala (3)安装spark (4)验证安装情况 1.环境准备 (1)配套软件版本要求:Sp ...
- 搭建hadoop2.6.0集群环境 分类: A1_HADOOP 2015-04-20 07:21 459人阅读 评论(0) 收藏
一.规划 (一)硬件资源 10.171.29.191 master 10.171.94.155 slave1 10.251.0.197 slave3 (二)基本资料 用户: jediael 目录: ...
- Hadoop1.2.1伪分布模式安装指南 分类: A1_HADOOP 2014-08-17 10:52 1346人阅读 评论(0) 收藏
一.前置条件 1.操作系统准备 (1)Linux可以用作开发平台及产品平台. (2)win32只可用作开发平台,且需要cygwin的支持. 2.安装jdk 1.6或以上 3.安装ssh,并配置免密码登 ...
随机推荐
- VPS 上线监控监控脚本
文件地址 https://github.com/yourshell/yisuo-script/blob/master/vpstz/vpsmon.zip https://download.csdn.ne ...
- 华为畅玩5 (CUN-AL00) 刷入第三方twrp Recovery 及 root
华为畅玩5 (CUN-AL00) 刷入第三方twrp Recovery 及 root 下载地址 http://pan.baidu.com/s/1hsn6VzA 1. 在官网申请解锁码 申 ...
- Direct2D 如何关闭抗锯齿
// Each pixel is rendered if its pixel center is contained by the geometry. // D2D1_ANTIALIAS_MODE_A ...
- opera mini 7.5安卓改服版
opera mini 7.5安卓改服版Opera mini 7.5安卓版前两天发布了,试着进行改服实现***,过程跟以前的OPM7.0差不多,大家可参照我之前的博客教程Opera mini7.0改服教 ...
- ASP.NET MVC案例教程(基于ASP.NET MVC beta)——第三篇:ASP.NET MVC全局观
摘要 本文对ASP.NET MVC的全局运行机理进行一个简要的介绍,以使得朋友们更好的理解后续文章. 前言 在上一篇文章中,我们实现了第一个ASP.NET MVC页面.对于没有接触 ...
- Python 极简教程(十一)字典 dict
字典是以大括号标识,以键值对(key:value)的形式,无序,不可重复,可变的集合类型. 字典具有非常高效的读写效率. >>> d = {} # 创建一个空字典 >>& ...
- socket长连接的维持
长连接的维持,是要客户端程序,定时向服务端程序,发送一个维持连接包的.如果,长时间未发送维持连接包,服务端程序将断开连接. 客户端:通过持有Client对象,可以随时(使用sendObject方法)发 ...
- angular 响应式自定义表单控件—注册头像实例
1. 组件继承ControlValueAccessor,ControlValueAccessor接口需要实现三个必选方法 writeValue() 用于向元素中写入值,获取表单的元素的元素值 regi ...
- JS学习笔记 - 透明度运动框
该练习的笔记如下: 1. var cur=0; //先声明一个变量. 2. parseInt会舍掉小数,而opacity值恰恰是小数,所以对于opacity,必须用parseFloat. cur ...
- SpringMVC响应Ajax请求(@Responsebody注解返回页面)
项目需求描述:page1中的ajax请求Controller,Controller负责将service返回的数据填充到page2中,并将page2整个页面返回到page1中ajax的回调函数. 一句话 ...