Installing a Hadoop 1.2.1 Cluster
I. Planning
(i) Hardware resources
10.171.29.191 master
10.173.54.84 slave1
10.171.114.223 slave2
(ii) Basic information
User: jediael
Directory: /opt/jediael/
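II. Environment setup
(i) Use the same username and password on every machine, and allow jediael to run any command through sudo
# passwd
# useradd jediael
# passwd jediael
# vi /etc/sudoers
Add the following line:
jediael ALL=(ALL) ALL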
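(ii) Create the directory /opt/jediael
$ sudo chown jediael:jediael /opt
$ cd /opt
$ mkdir jediael
Note: /opt must be owned by jediael. Otherwise formatting the NameNode fails, since the NameNode stores its data under /opt/tmphadoop (the hadoop.tmp.dir value set later) and jediael must be able to create that directory.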
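The ownership requirement in the note above can be checked before formatting the NameNode. The following is a minimal sketch, assuming it is run as the jediael user; `can_own_and_write` is a hypothetical helper name, not part of Hadoop.

```shell
# Hypothetical helper (not part of Hadoop): succeeds only if the directory
# exists, is owned by the current user, and is writable by that user.
can_own_and_write() {
    [ -d "$1" ] && [ -O "$1" ] && [ -w "$1" ]
}

# Example, run as jediael:
#   can_own_and_write /opt && echo "/opt is ready" || echo "fix ownership of /opt first"
```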
(iii) Set the hostname and edit /etc/hosts
1. Edit /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=*******
2. Edit /etc/hosts
10.171.29.191 master
10.173.54.84 slave1
10.171.114.223 slave2
Note: the hosts file must not contain a 127.0.0.1 ***** entry, otherwise connections fail with errors such as: org.apache.hadoop.ipc.Client: Retrying connect to server: master/10.171.29.191:9000. Already trie
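A quick way to spot the problematic entry is to scan the hosts file for loopback lines that also name a cluster host. This is only a sketch, assuming the warning above refers to the loopback address being mapped to one of the cluster hostnames; `find_loopback_conflicts` is a hypothetical helper, not a Hadoop tool.

```shell
# Hypothetical helper (not part of Hadoop): print every line of a hosts file
# that starts with a 127.x.x.x address and also names one of the given hosts.
# Usage: find_loopback_conflicts HOSTS_FILE HOSTNAME...
find_loopback_conflicts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v host="$name" '
            $1 ~ /^127\./ {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break          # rest of the line is a comment
                    if ($i == host) { print; break }
                }
            }' "$hosts_file"
    done
}

# Example:
#   find_loopback_conflicts /etc/hosts master slave1 slave2
```

No output means no conflicting entry was found for the names given.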
3. Apply the hostname to the running system with the hostname command
hostname ****
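(iv) Configure passwordless SSH login
Run the following commands on master as the jediael user:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Then copy authorized_keys to slave1 and slave2:
$ scp ~/.ssh/authorized_keys slave1:~/.ssh/
$ scp ~/.ssh/authorized_keys slave2:~/.ssh/
Notes:
(1) If scp reports that the .ssh directory does not exist, ssh has never been run on that machine. Run ssh once there to create the directory.
(2) The .ssh/ directory must have mode 700 and authorized_keys mode 600. Key login fails if the permissions are too loose, and the files become unusable if they are too tight.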
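With its default StrictModes setting, sshd expects ~/.ssh to be mode 700 and authorized_keys to be mode 600. The sketch below applies those modes on one machine; `fix_ssh_permissions` is a hypothetical helper name, and it would need to be run on master, slave1 and slave2 in turn.

```shell
# Hypothetical helper: apply the permissions sshd expects to an .ssh directory.
# Defaults to the current user's ~/.ssh when no directory is given.
fix_ssh_permissions() {
    ssh_dir=${1:-"$HOME/.ssh"}
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys"
    fi
}

# Example, run as jediael on each machine:
#   fix_ssh_permissions
```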
(v) Install Java on each of the three machines and set the related environment variables
See http://blog.csdn.net/jediael_lu/article/details/38925871
(vi) Download hadoop-1.2.1.tar.gz and extract it into /opt/jediael
III. Edit the configuration files
[Do this on all three machines]
(i) Edit conf/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_51
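Since the JDK path may differ from machine to machine, a small check can confirm that the value really points at a Java installation. This is a sketch only; `is_java_home` is a hypothetical helper, and the path in the example is the one used in this article.

```shell
# Hypothetical helper: succeeds only if DIR/bin/java exists and is executable.
is_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example:
#   is_java_home /usr/java/jdk1.7.0_51 || echo "JAVA_HOME does not point at a Java installation"
```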
(ii) Edit core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/tmphadoop</value>
</property>
(iii) Edit hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
(iv) Edit mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
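The property fragments above belong inside the <configuration> root element of each site file. As an illustration, the sketch below writes a complete file from name/value pairs; `write_site_file` is a hypothetical helper, not something shipped with Hadoop, and editing the files by hand works just as well.

```shell
# Hypothetical helper: write a Hadoop site file containing the given
# name/value pairs wrapped in a <configuration> root element.
# Usage: write_site_file OUTPUT_FILE NAME VALUE [NAME VALUE ...]
write_site_file() {
    out=$1
    shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ "$#" -ge 2 ]; do
            printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
            shift 2
        done
        echo '</configuration>'
    } > "$out"
}

# Examples, using the values from this article:
#   write_site_file conf/core-site.xml fs.default.name hdfs://master:9000 hadoop.tmp.dir /opt/tmphadoop
#   write_site_file conf/hdfs-site.xml dfs.replication 2
#   write_site_file conf/mapred-site.xml mapred.job.tracker master:9001
```

Note that the helper overwrites the target file, so any comments or extra properties already in it would be lost.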
(v) Edit conf/masters and conf/slaves
masters:
master
slaves:
slave1
slave2
You can finish the configuration above on master and then copy it to slave1 and slave2 with scp, for example:
$ scp core-site.xml slave2:/opt/jediael/hadoop-1.2.1/conf
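Copying file by file gets tedious, so the copy can be driven from the slaves file instead. The sketch below only prints the scp commands so they can be reviewed first. It assumes Hadoop is installed at the same path on every machine, as in the example above, and `print_sync_commands` is a hypothetical helper name.

```shell
# Hypothetical helper: print one scp command per host listed in CONF_DIR/slaves.
# Nothing is copied until the printed commands are actually run.
# Usage: print_sync_commands CONF_DIR
print_sync_commands() {
    conf_dir=$1
    # Skip blank lines and comment lines in the slaves file.
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$conf_dir/slaves" |
    while read -r host; do
        echo "scp -r $conf_dir/* $host:$conf_dir/"
    done
}

# Example: review the commands, then run them by piping to sh.
#   print_sync_commands /opt/jediael/hadoop-1.2.1/conf
#   print_sync_commands /opt/jediael/hadoop-1.2.1/conf | sh
```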
IV. Start and verify
1. Format the NameNode [only needed on master, where the NameNode runs]
[jediael@master hadoop-1.2.1]$ bin/hadoop namenode -format
15/01/21 15:13:40 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/10.171.29.191
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.7.0_51
************************************************************/
Re-format filesystem in /opt/tmphadoop/dfs/name ? (Y or N) Y
15/01/21 15:13:43 INFO util.GSet: Computing capacity for map BlocksMap
15/01/21 15:13:43 INFO util.GSet: VM type = 64-bit
15/01/21 15:13:43 INFO util.GSet: 2.0% max memory = 1013645312
15/01/21 15:13:43 INFO util.GSet: capacity = 2^21 = 2097152 entries
15/01/21 15:13:43 INFO util.GSet: recommended=2097152, actual=2097152
15/01/21 15:13:43 INFO namenode.FSNamesystem: fsOwner=jediael
15/01/21 15:13:43 INFO namenode.FSNamesystem: supergroup=supergroup
15/01/21 15:13:43 INFO namenode.FSNamesystem: isPermissionEnabled=true
15/01/21 15:13:43 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
15/01/21 15:13:43 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
15/01/21 15:13:43 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
15/01/21 15:13:43 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/01/21 15:13:44 INFO common.Storage: Image file /opt/tmphadoop/dfs/name/current/fsimage of size 113 bytes saved in 0 seconds.
15/01/21 15:13:44 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/opt/tmphadoop/dfs/name/current/edits
15/01/21 15:13:44 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/opt/tmphadoop/dfs/name/current/edits
15/01/21 15:13:44 INFO common.Storage: Storage directory /opt/tmphadoop/dfs/name has been successfully formatted.
15/01/21 15:13:44 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/10.171.29.191
************************************************************/
2. Start Hadoop [only needs to be run on master]
[jediael@master hadoop-1.2.1]$ bin/start-all.sh
starting namenode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-namenode-master.out
slave1: starting datanode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-datanode-slave1.out
slave2: starting datanode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-datanode-slave2.out
master: starting secondarynamenode, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-secondarynamenode-master.out
starting jobtracker, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-jobtracker-master.out
slave1: starting tasktracker, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-tasktracker-slave1.out
slave2: starting tasktracker, logging to /opt/jediael/hadoop-1.2.1/libexec/../logs/hadoop-jediael-tasktracker-slave2.out
3. Check the web interfaces
NameNode http://ip:50070
JobTracker http://ip:50030
4. Check the Java processes on each host
(1) master:
$ jps
17963 NameNode
18280 JobTracker
18446 Jps
18171 SecondaryNameNode
(2) slave1:
$ jps
16019 Jps
15858 DataNode
15954 TaskTracker
(3) slave2:
$ jps
15625 Jps
15465 DataNode
15561 TaskTracker
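The jps listings above can also be checked automatically. This is a minimal sketch that reads jps output on standard input and prints any expected daemon that is missing; `missing_daemons` is a hypothetical helper, and the daemon names are the ones shown in the listings above.

```shell
# Hypothetical helper: read jps output on stdin and print each expected
# daemon name that does not appear in it.
# Usage: jps | missing_daemons NAME...
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <class name>"; compare the second field exactly
        # so that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Examples:
#   on master:  jps | missing_daemons NameNode SecondaryNameNode JobTracker
#   on a slave: jps | missing_daemons DataNode TaskTracker
```

Empty output means every expected daemon is running on that host.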
V. Run a complete MapReduce job
Everything below is run on master only.
1. Copy wordcount.jar to the server
For the program, see http://blog.csdn.net/jediael_lu/article/details/37596469
2. Create the input directory and copy a file into it
[jediael@master166 ~]$ hadoop fs -mkdir /wcin
[jediael@master166 projects]$ hadoop fs -copyFromLocal /opt/jediael/hadoop-1.2.1/conf/hdfs-site.xml /wcin
3. Run the job
[jediael@master166 projects]$ hadoop jar wordcount.jar org.jediael.hadoopdemo.wordcount.WordCount /wcin /wcout
14/08/31 20:04:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/08/31 20:04:26 INFO input.FileInputFormat: Total input paths to process : 1
14/08/31 20:04:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/08/31 20:04:26 WARN snappy.LoadSnappy: Snappy native library not loaded
14/08/31 20:04:26 INFO mapred.JobClient: Running job: job_201408311554_0003
14/08/31 20:04:27 INFO mapred.JobClient: map 0% reduce 0%
14/08/31 20:04:31 INFO mapred.JobClient: map 100% reduce 0%
14/08/31 20:04:40 INFO mapred.JobClient: map 100% reduce 100%
14/08/31 20:04:40 INFO mapred.JobClient: Job complete: job_201408311554_0003
14/08/31 20:04:40 INFO mapred.JobClient: Counters: 29
14/08/31 20:04:40 INFO mapred.JobClient: Job Counters
14/08/31 20:04:40 INFO mapred.JobClient: Launched reduce tasks=1
14/08/31 20:04:40 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=4230
14/08/31 20:04:40 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/08/31 20:04:40 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/08/31 20:04:40 INFO mapred.JobClient: Launched map tasks=1
14/08/31 20:04:40 INFO mapred.JobClient: Data-local map tasks=1
14/08/31 20:04:40 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=8531
14/08/31 20:04:40 INFO mapred.JobClient: File Output Format Counters
14/08/31 20:04:40 INFO mapred.JobClient: Bytes Written=284
14/08/31 20:04:40 INFO mapred.JobClient: FileSystemCounters
14/08/31 20:04:40 INFO mapred.JobClient: FILE_BYTES_READ=370
14/08/31 20:04:40 INFO mapred.JobClient: HDFS_BYTES_READ=357
14/08/31 20:04:40 INFO mapred.JobClient: FILE_BYTES_WRITTEN=104958
14/08/31 20:04:40 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=284
14/08/31 20:04:40 INFO mapred.JobClient: File Input Format Counters
14/08/31 20:04:40 INFO mapred.JobClient: Bytes Read=252
14/08/31 20:04:40 INFO mapred.JobClient: Map-Reduce Framework
14/08/31 20:04:40 INFO mapred.JobClient: Map output materialized bytes=370
14/08/31 20:04:40 INFO mapred.JobClient: Map input records=11
14/08/31 20:04:40 INFO mapred.JobClient: Reduce shuffle bytes=370
14/08/31 20:04:40 INFO mapred.JobClient: Spilled Records=40
14/08/31 20:04:40 INFO mapred.JobClient: Map output bytes=324
14/08/31 20:04:40 INFO mapred.JobClient: Total committed heap usage (bytes)=238026752
14/08/31 20:04:40 INFO mapred.JobClient: CPU time spent (ms)=1130
14/08/31 20:04:40 INFO mapred.JobClient: Combine input records=0
14/08/31 20:04:40 INFO mapred.JobClient: SPLIT_RAW_BYTES=105
14/08/31 20:04:40 INFO mapred.JobClient: Reduce input records=20
14/08/31 20:04:40 INFO mapred.JobClient: Reduce input groups=20
14/08/31 20:04:40 INFO mapred.JobClient: Combine output records=0
14/08/31 20:04:40 INFO mapred.JobClient: Physical memory (bytes) snapshot=289288192
14/08/31 20:04:40 INFO mapred.JobClient: Reduce output records=20
14/08/31 20:04:40 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1533636608
14/08/31 20:04:40 INFO mapred.JobClient: Map output records=20
4. View the results
[jediael@master166 projects]$ hadoop fs -cat /wcout/*
--> 1
<!-- 1
</configuration> 1
</property> 1
<?xml 1
<?xml-stylesheet 1
<configuration> 1
<name>dfs.replication</name> 1
<property> 1
<value>2</value> 1
Put 1
file. 1
href="configuration.xsl"?> 1
in 1
overrides 1
property 1
site-specific 1
this 1
type="text/xsl" 1
version="1.0"?> 1
cat: File does not exist: /wcout/_logs
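The final "File does not exist" line appears because /wcout/* also matches the _logs directory, which cat cannot print. A possible way around it, assuming the job writes its results to files whose names start with part- (the usual MapReduce convention), is to match only those files. `show_job_output` is a hypothetical wrapper name.

```shell
# Hypothetical helper: print only the result files of a job's output directory.
# Usage: show_job_output HDFS_OUTPUT_DIR
show_job_output() {
    # Quote the pattern so that HDFS, not the local shell, expands it.
    hadoop fs -cat "$1/part-*"
}

# Example:
#   show_job_output /wcout
```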