Data - Hadoop Standalone Configuration - Using Hadoop 2.8.0 and Ubuntu 16.04

System version

anliven@Ubuntu1604:~$ uname -a

Linux Ubuntu1604 4.8.0-36-generic #36~16.04.1-Ubuntu SMP Sun Feb 5 09:39:57 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

anliven@Ubuntu1604:~$

anliven@Ubuntu1604:~$ cat /proc/version

Linux version 4.8.0-36-generic (buildd@lgw01-18) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #36~16.04.1-Ubuntu SMP Sun Feb 5 09:39:57 UTC 2017

anliven@Ubuntu1604:~$

anliven@Ubuntu1604:~$ lsb_release -a

No LSB modules are available.

Distributor ID:	Ubuntu

Description:	Ubuntu 16.04.2 LTS

Release:	16.04

Codename:	xenial

anliven@Ubuntu1604:~$

Create the hadoop user

anliven@Ubuntu1604:~$ sudo useradd -m hadoop -s /bin/bash

anliven@Ubuntu1604:~$ sudo passwd hadoop

Enter new UNIX password:

Retype new UNIX password:

passwd: password updated successfully

anliven@Ubuntu1604:~$

anliven@Ubuntu1604:~$ sudo adduser hadoop sudo

Adding user `hadoop' to group `sudo' ...

Adding user hadoop to group sudo

Done.

anliven@Ubuntu1604:~$
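The prompts in the following sections read hadoop@Ubuntu1604, so the remaining steps are run as the new hadoop user (for example after `su - hadoop`). Before switching, it can help to confirm that the account exists and is in the sudo group. The following is a minimal sketch; `user_in_group` is a hypothetical helper name, not a system command.

```shell
# Return success if user $1 exists and belongs to group $2.
# user_in_group is a hypothetical helper, introduced only for illustration.
user_in_group() {
  # id -nG prints the user's group names separated by spaces;
  # it prints nothing on stdout if the user does not exist.
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

if user_in_group hadoop sudo; then
  echo "hadoop is in the sudo group"
else
  echo "hadoop is missing or not in the sudo group"
fi
```

If the check succeeds, `su - hadoop` starts a login shell as the new user.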

Update apt and install vim

hadoop@Ubuntu1604:~$ sudo apt-get update

Hit:1 http://mirrors.aliyun.com/ubuntu xenial InRelease

Hit:2 http://mirrors.aliyun.com/ubuntu xenial-updates InRelease

Hit:3 http://mirrors.aliyun.com/ubuntu xenial-backports InRelease

Hit:4 http://mirrors.aliyun.com/ubuntu xenial-security InRelease

Reading package lists... Done

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ sudo apt-get install vim

Reading package lists... Done

Building dependency tree

Reading state information... Done

vim is already the newest version (2:7.4.1689-3ubuntu1.2).

0 upgraded, 0 newly installed, 0 to remove and 50 not upgraded.

hadoop@Ubuntu1604:~$

Configure passwordless SSH login

hadoop@Ubuntu1604:~$ sudo apt-get install openssh-server

Reading package lists... Done

Building dependency tree

Reading state information... Done

openssh-server is already the newest version (1:7.2p2-4ubuntu2.1).

0 upgraded, 0 newly installed, 0 to remove and 50 not upgraded.

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ cd ~

hadoop@Ubuntu1604:~$ mkdir .ssh

hadoop@Ubuntu1604:~$ cd .ssh

hadoop@Ubuntu1604:~/.ssh$ ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /home/hadoop/.ssh/id_rsa.

Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.

The key fingerprint is:

SHA256:DzjVWgTQB5I1JGRBmWi6gVHJ03V4WnJZEdojtbou0DM hadoop@Ubuntu1604

The key's randomart image is:

+---[RSA 2048]----+

| o.o =X@B=*o     |

|. + +.*+*B..     |

| o +   *+.*      |

|. o   .o = .     |

|   o .o S        |

|  . . E. +       |

|     . o. .      |

|      ..         |

|       ..        |

+----[SHA256]-----+

hadoop@Ubuntu1604:~/.ssh$

hadoop@Ubuntu1604:~/.ssh$ cat id_rsa.pub >> authorized_keys

hadoop@Ubuntu1604:~/.ssh$ ls -l

total 12

-rw-rw-r-- 1 hadoop hadoop  399 Apr 27 07:33 authorized_keys

-rw------- 1 hadoop hadoop 1679 Apr 27 07:32 id_rsa

-rw-r--r-- 1 hadoop hadoop  399 Apr 27 07:32 id_rsa.pub

hadoop@Ubuntu1604:~/.ssh$
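In the listing above, authorized_keys is group-writable (rw-rw-r--). With sshd's default StrictModes setting, key files that are writable by group or others can be rejected, in which case ssh falls back to asking for a password. The following is a minimal sketch of a stricter setup. `install_pubkey` is a hypothetical helper name; it appends the key only once and then tightens the permissions.

```shell
# Append a public key to authorized_keys only if it is not already present,
# then restrict permissions on the directory and the file.
# install_pubkey is a hypothetical helper, not an OpenSSH command.
install_pubkey() {
  local pubkey_file="$1"
  local ssh_dir="$2"
  local auth_file="$ssh_dir/authorized_keys"

  mkdir -p "$ssh_dir"
  touch "$auth_file"
  # -x: match whole lines, -F: treat the key as a fixed string
  if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
  chmod 700 "$ssh_dir"    # only the owner may enter the directory
  chmod 600 "$auth_file"  # only the owner may read or write the file
}
```

For the layout used in this post, the call would be `install_pubkey ~/.ssh/id_rsa.pub ~/.ssh`. The key itself can be generated without prompts with `ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa`.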

hadoop@Ubuntu1604:~/.ssh$ cd

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ ssh localhost

The authenticity of host 'localhost (127.0.0.1)' can't be established.

ECDSA key fingerprint is SHA256:fZ7fAvnnFk0/Imkn0YPdc2Gzxnfr0IJGSRb1swbm7oU.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.

Welcome to Ubuntu 16.04.2 LTS (GNU/Linux 4.8.0-36-generic x86_64)

 * Documentation:  https://help.ubuntu.com

 * Management:     https://landscape.canonical.com

 * Support:        https://ubuntu.com/advantage

44 packages can be updated.

0 updates are security updates.

*** System restart required ***

Last login: Thu Apr 27 07:25:26 2017 from 192.168.16.1

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ exit

logout

Connection to localhost closed.

hadoop@Ubuntu1604:~$

Install Java

hadoop@Ubuntu1604:~$ dpkg -l |grep jdk

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ sudo apt-get install openjdk-8-jre openjdk-8-jdk

Reading package lists... Done

Building dependency tree

Reading state information... Done

The following additional packages will be installed:

......

......

......

done.

Processing triggers for libc-bin (2.23-0ubuntu7) ...

Processing triggers for ca-certificates (20160104ubuntu1) ...

Updating certificates in /etc/ssl/certs...

0 added, 0 removed; done.

Running hooks in /etc/ca-certificates/update.d...

done.

done.

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ dpkg -l |grep jdk

ii  openjdk-8-jdk:amd64                        8u121-b13-0ubuntu1.16.04.2                    amd64        OpenJDK Development Kit (JDK)

ii  openjdk-8-jdk-headless:amd64               8u121-b13-0ubuntu1.16.04.2                    amd64        OpenJDK Development Kit (JDK) (headless)

ii  openjdk-8-jre:amd64                        8u121-b13-0ubuntu1.16.04.2                    amd64        OpenJDK Java runtime, using Hotspot JIT

ii  openjdk-8-jre-headless:amd64               8u121-b13-0ubuntu1.16.04.2                    amd64        OpenJDK Java runtime, using Hotspot JIT (headless)

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ dpkg -L openjdk-8-jdk | grep '/bin$'

/usr/lib/jvm/java-8-openjdk-amd64/bin

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ vim ~/.bashrc

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ head ~/.bashrc |grep java

export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ source ~/.bashrc

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ echo $JAVA_HOME

/usr/lib/jvm/java-8-openjdk-amd64

hadoop@Ubuntu1604:~$

hadoop@Ubuntu1604:~$ java -version

openjdk version "1.8.0_121"

OpenJDK Runtime Environment (build 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13)

OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)

hadoop@Ubuntu1604:~$
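The transcript above edits ~/.bashrc by hand in vim, and the `head ~/.bashrc | grep java` check suggests the export was placed near the top of the file. That placement matters on Ubuntu, because the stock ~/.bashrc returns early for non-interactive shells. The following is a minimal sketch of a scripted alternative. Both helper names are hypothetical, and the path handling assumes the OpenJDK 8 layout shown above.

```shell
# Derive a JAVA_HOME value from the resolved path of the java binary by
# stripping the trailing /bin/java and, if present, the /jre component.
# java_home_from_path is a hypothetical helper name.
java_home_from_path() {
  local p="$1"
  p="${p%/bin/java}"
  p="${p%/jre}"
  printf '%s\n' "$p"
}

# Insert a line at the top of a file unless the exact line already exists.
# prepend_line_once is a hypothetical helper name.
prepend_line_once() {
  local target="$1"
  local line="$2"
  local scratch
  if grep -qxF -- "$line" "$target" 2>/dev/null; then
    return 0
  fi
  scratch="$(mktemp)"
  printf '%s\n' "$line" > "$scratch"
  if [ -f "$target" ]; then
    cat "$target" >> "$scratch"
  fi
  # Write back through redirection so the target keeps its permissions.
  cat "$scratch" > "$target"
  rm -f "$scratch"
}
```

A possible use, assuming java is already on the PATH: `prepend_line_once ~/.bashrc "export JAVA_HOME=\"$(java_home_from_path "$(readlink -f "$(command -v java)")")\""`, followed by `source ~/.bashrc`.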

Install Hadoop

hadoop@Ubuntu1604:~$ sudo tar -zxf ~/hadoop-2.8.0.tar.gz -C /usr/local

[sudo] password for hadoop:

hadoop@Ubuntu1604:~$ cd /usr/local

hadoop@Ubuntu1604:/usr/local$ sudo mv ./hadoop-2.8.0/ ./hadoop

hadoop@Ubuntu1604:/usr/local$ sudo chown -R hadoop ./hadoop

hadoop@Ubuntu1604:/usr/local$ ls -l |grep hadoop

drwxr-xr-x 9 hadoop dialout 4096 Mar 17 13:31 hadoop

hadoop@Ubuntu1604:/usr/local$ cd ./hadoop

hadoop@Ubuntu1604:/usr/local/hadoop$ ls -l

total 148

drwxr-xr-x 2 hadoop dialout  4096 Mar 17 13:31 bin

drwxr-xr-x 3 hadoop dialout  4096 Mar 17 13:31 etc

drwxr-xr-x 2 hadoop dialout  4096 Mar 17 13:31 include

drwxr-xr-x 3 hadoop dialout  4096 Mar 17 13:31 lib

drwxr-xr-x 2 hadoop dialout  4096 Mar 17 13:31 libexec

-rw-r--r-- 1 hadoop dialout 99253 Mar 17 13:31 LICENSE.txt

-rw-r--r-- 1 hadoop dialout 15915 Mar 17 13:31 NOTICE.txt

-rw-r--r-- 1 hadoop dialout  1366 Mar 17 13:31 README.txt

drwxr-xr-x 2 hadoop dialout  4096 Mar 17 13:31 sbin

drwxr-xr-x 4 hadoop dialout  4096 Mar 17 13:31 share

hadoop@Ubuntu1604:/usr/local/hadoop$ ./bin/hadoop version

Hadoop 2.8.0

Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 91f2b7a13d1e97be65db92ddabc627cc29ac0009

Compiled by jdu on 2017-03-17T04:12Z

Compiled with protoc 2.5.0

From source with checksum 60125541c2b3e266cbf3becc5bda666

This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.8.0.jar

hadoop@Ubuntu1604:/usr/local/hadoop$
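The commands in this post call ./bin/hadoop relative to /usr/local/hadoop. As an optional convenience, the following sketch puts the Hadoop scripts on the PATH so that `hadoop version` works from any directory. HADOOP_HOME is a conventional variable name that the original steps do not set, so treat it as an assumption here.

```shell
# Assumption: Hadoop was unpacked to /usr/local/hadoop as in the steps above.
# HADOOP_HOME is a conventional name; the original post does not define it.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
export HADOOP_HOME

# Append bin and sbin to PATH only if bin is not already on it.
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) ;;
  *) PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin" ;;
esac
export PATH
```

These lines could go into ~/.bashrc next to the JAVA_HOME export.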

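Run the grep example in Hadoop standalone mode

Hadoop's default mode is non-distributed (local) mode, which runs without any further configuration. Non-distributed means everything runs in a single Java process, which is convenient for debugging.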

hadoop@Ubuntu1604:~$ cd /usr/local/hadoop/

hadoop@Ubuntu1604:/usr/local/hadoop$ mkdir ./input

hadoop@Ubuntu1604:/usr/local/hadoop$ cp ./etc/hadoop/*.xml ./input/

hadoop@Ubuntu1604:/usr/local/hadoop$ ls -l input/

total 56

drwxrwxr-x  2 hadoop hadoop  4096 Apr 27 22:23 ./

drwxr-xr-x 10 hadoop dialout 4096 Apr 27 22:23 ../

-rw-r--r--  1 hadoop hadoop  4942 Apr 27 22:23 capacity-scheduler.xml

-rw-r--r--  1 hadoop hadoop   774 Apr 27 22:23 core-site.xml

-rw-r--r--  1 hadoop hadoop  9683 Apr 27 22:23 hadoop-policy.xml

-rw-r--r--  1 hadoop hadoop   775 Apr 27 22:23 hdfs-site.xml

-rw-r--r--  1 hadoop hadoop   620 Apr 27 22:23 httpfs-site.xml

-rw-r--r--  1 hadoop hadoop  3518 Apr 27 22:23 kms-acls.xml

-rw-r--r--  1 hadoop hadoop  5546 Apr 27 22:23 kms-site.xml

-rw-r--r--  1 hadoop hadoop   690 Apr 27 22:23 yarn-site.xml

hadoop@Ubuntu1604:/usr/local/hadoop$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar grep ./input ./output 'dfs[a-z.]+'

17/04/27 22:29:45 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id

17/04/27 22:29:45 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=

17/04/27 22:29:45 INFO input.FileInputFormat: Total input files to process : 8

17/04/27 22:29:45 INFO mapreduce.JobSubmitter: number of splits:8

......

......

......

17/04/27 22:29:49 INFO mapreduce.Job: Counters: 30

	File System Counters

		FILE: Number of bytes read=1273712

		FILE: Number of bytes written=2504878

		FILE: Number of read operations=0

		FILE: Number of large read operations=0

		FILE: Number of write operations=0

	Map-Reduce Framework

		Map input records=1

		Map output records=1

		Map output bytes=17

		Map output materialized bytes=25

		Input split bytes=121

		Combine input records=0

		Combine output records=0

		Reduce input groups=1

		Reduce shuffle bytes=25

		Reduce input records=1

		Reduce output records=1

		Spilled Records=2

		Shuffled Maps =1

		Failed Shuffles=0

		Merged Map outputs=1

		GC time elapsed (ms)=0

		Total committed heap usage (bytes)=1054867456

	Shuffle Errors

		BAD_ID=0

		CONNECTION=0

		IO_ERROR=0

		WRONG_LENGTH=0

		WRONG_MAP=0

		WRONG_REDUCE=0

	File Input Format Counters

		Bytes Read=123

	File Output Format Counters

		Bytes Written=23

hadoop@Ubuntu1604:/usr/local/hadoop$

hadoop@Ubuntu1604:/usr/local/hadoop$ ls -l ./output/

total 4

-rw-r--r-- 1 hadoop hadoop 11 Apr 27 22:29 part-r-00000

-rw-r--r-- 1 hadoop hadoop  0 Apr 27 22:29 _SUCCESS

hadoop@Ubuntu1604:/usr/local/hadoop$

hadoop@Ubuntu1604:/usr/local/hadoop$ cat ./output/*

1	dfsadmin

hadoop@Ubuntu1604:/usr/local/hadoop$
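The result line `1	dfsadmin` means that the regular expression 'dfs[a-z.]+' matched the string dfsadmin once across the input files. To cross-check a result like this without Hadoop, the same count can be approximated with standard tools. The following is a rough sketch, not how the Hadoop example is implemented; `local_grep_count` is a hypothetical helper name, and it assumes the matches contain no whitespace.

```shell
# Count how often each regex match occurs in the files of a directory and
# print "count<TAB>match", most frequent first.
# local_grep_count is a hypothetical helper, introduced only for illustration.
local_grep_count() {
  local dir="$1"
  local pattern="$2"
  # -o: print only the matched text, -h: omit file names, -E: extended regex
  grep -ohE -- "$pattern" "$dir"/* |
    sort | uniq -c | sort -rn |
    awk '{ printf "%s\t%s\n", $1, $2 }'
}
```

Run from /usr/local/hadoop, `local_grep_count ./input 'dfs[a-z.]+'` should list the same matches as `cat ./output/*`.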

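Hadoop does not overwrite existing results: the job fails if the output directory already exists, so delete ./output before running the example again.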

hadoop@Ubuntu1604:/usr/local/hadoop$ rm -rf ./output
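Because `rm -rf` is unforgiving when the path is mistyped, a small guard can make the cleanup step safer. The sketch below only deletes a directory that contains the _SUCCESS marker seen in the listing above, which indicates it was written by a completed job. `clean_output_dir` is a hypothetical helper name.

```shell
# Delete a job output directory, but only if it looks like one.
# clean_output_dir is a hypothetical helper, introduced only for illustration.
clean_output_dir() {
  local out_dir="$1"
  if [ -z "$out_dir" ]; then
    echo "usage: clean_output_dir DIR" >&2
    return 2
  fi
  if [ ! -e "$out_dir" ]; then
    return 0  # nothing to clean
  fi
  if [ -f "$out_dir/_SUCCESS" ]; then
    rm -rf -- "$out_dir"
  else
    echo "refusing to delete $out_dir: no _SUCCESS marker found" >&2
    return 1
  fi
}
```

With this in place, `clean_output_dir ./output` can be run before each new job. Note that output from a failed job has no _SUCCESS marker and would still need to be removed by hand.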

Examples bundled with Hadoop

hadoop@Ubuntu1604:/usr/local/hadoop$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar

An example program must be given as the first argument.

Valid program names are:

  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.

  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.

  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.

  dbcount: An example job that count the pageview counts from a database.

  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.

  grep: A map/reduce program that counts the matches of a regex in the input.

  join: A job that effects a join over sorted, equally partitioned datasets

  multifilewc: A job that counts words from several files.

  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.

  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.

  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.

  randomwriter: A map/reduce program that writes 10GB of random data per node.

  secondarysort: An example defining a secondary sort to the reduce.

  sort: A map/reduce program that sorts the data written by the random writer.

  sudoku: A sudoku solver.

  teragen: Generate data for the terasort

  terasort: Run the terasort

  teravalidate: Checking results of terasort

  wordcount: A map/reduce program that counts the words in the input files.

  wordmean: A map/reduce program that counts the average length of the words in the input files.

  wordmedian: A map/reduce program that counts the median length of the words in the input files.

  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.

hadoop@Ubuntu1604:/usr/local/hadoop$
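Any program name from the list above can be passed as the first argument after the jar path. The following sketch wraps that call so the long jar path does not have to be retyped. `hadoop_example` is a hypothetical helper name and HADOOP_HOME is an assumed variable. If the hadoop binary is not found, the function only prints the command it would run.

```shell
# Run one of the bundled MapReduce examples, or print the command as a
# dry run when Hadoop is not installed at the expected location.
# hadoop_example is a hypothetical helper; HADOOP_HOME is an assumption.
hadoop_example() {
  local home="${HADOOP_HOME:-/usr/local/hadoop}"
  local jar="$home/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar"
  if [ -x "$home/bin/hadoop" ]; then
    "$home/bin/hadoop" jar "$jar" "$@"
  else
    echo "dry run: $home/bin/hadoop jar $jar $*"
  fi
}
```

For example, `hadoop_example wordcount ./input ./output` counts the words in the input files, and `hadoop_example pi 2 10` estimates Pi using 2 map tasks with 10 samples each.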
