Other articles in the big data security series

https://www.cnblogs.com/bainianminguo/p/12548076.html ----------- Installing Kerberos

https://www.cnblogs.com/bainianminguo/p/12548334.html ----------- Kerberos authentication for Hadoop

https://www.cnblogs.com/bainianminguo/p/12548175.html ----------- Kerberos authentication for ZooKeeper

https://www.cnblogs.com/bainianminguo/p/12584732.html ----------- Kerberos authentication for Hive

https://www.cnblogs.com/bainianminguo/p/12584880.html ----------- Search Guard authentication for Elasticsearch

https://www.cnblogs.com/bainianminguo/p/12639821.html ----------- Kerberos authentication for Flink

https://www.cnblogs.com/bainianminguo/p/12639887.html ----------- Kerberos authentication for Spark

I. Install Hadoop

Note: host names appear with underscores (cluster2_host1) in the early steps below and with hyphens (cluster2-host1) from step 8 onward; both refer to the same machines. Underscores are not valid in host names, and the later command output shows the hyphenated form, so that is the form to use in the configuration files.

1. Unpack the archive and rename the installation directory

[root@cluster2_host1 data]# tar -zxvf hadoop-2.7.1.tar.gz -C /usr/local/
[root@cluster2_host1 local]# mv hadoop-2.7.1/ hadoop

  

2. Set the Hadoop environment variables

[root@cluster2_host1 bin]# vim /etc/profile

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin

  

3. Add the hdfs user and give it ownership of the Hadoop directory

groupadd hdfs
useradd hdfs -g hdfs
cat /etc/passwd
chown -R hdfs:hdfs /usr/local/hadoop/

  

4. Edit the Hadoop configuration files

vim core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://cluster2_host1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/vdb1/tmp</value>
</property>
</configuration>

  

vim mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cluster2_host1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cluster2_host1:19888</value>
</property>
</configuration>

  

vim hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/vdb1/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/vdb1/data</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>cluster2_host2:50090</value>
</property>
</configuration>

  

vim yarn-site.xml

<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cluster2_host1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cluster2_host1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cluster2_host1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cluster2_host1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cluster2_host1:8088</value>
</property>
</configuration>

  

Edit the slaves file

[root@cluster2_host1 hadoop]# cat slaves
cluster2_host1
cluster2_host3
cluster2_host2

  

5. Create the directories and change their ownership (on every node)

[root@cluster2_host3 bin]# groupadd hdfs
[root@cluster2_host3 bin]# useradd hdfs -g hdfs
[root@cluster2_host3 bin]# mkdir /data/vdb1/tmp
[root@cluster2_host3 bin]# mkdir /data/vdb1/data
[root@cluster2_host3 bin]# mkdir /data/vdb1/name
[root@cluster2_host3 bin]# chown -R hdfs:hdfs /data/vdb1/tmp/
[root@cluster2_host3 bin]# chown -R hdfs:hdfs /data/vdb1/data
[root@cluster2_host3 bin]# chown -R hdfs:hdfs /data/vdb1/name
[root@cluster2_host3 bin]# chown -R hdfs:hdfs /usr/local/hadoop/
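
Step 5 has to be repeated on every node. As a minimal sketch (the function name node_prep_commands is hypothetical, and mkdir -p is used instead of plain mkdir so that an existing directory is not an error), the commands above can be printed once and then sent to each host:

```shell
# Hypothetical helper: print (not run) the per-node preparation commands
# from step 5 for the data directories under BASE.
node_prep_commands() {
    local base="$1"   # parent of the data directories, e.g. /data/vdb1
    local dir
    echo "groupadd hdfs"
    echo "useradd hdfs -g hdfs"
    for dir in tmp data name; do
        echo "mkdir -p ${base}/${dir}"
        echo "chown -R hdfs:hdfs ${base}/${dir}"
    done
    echo "chown -R hdfs:hdfs /usr/local/hadoop/"
}
```

For example, `node_prep_commands /data/vdb1 | ssh root@cluster2_host3 bash -s` would run the printed commands on cluster2_host3, assuming the same root SSH access that the scp commands in this article rely on.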

  

6. Copy the installation directory to the other nodes

scp -r hadoop/ root@cluster2_host2:/usr/local/
scp -r hadoop/ root@cluster2_host3:/usr/local/

  

7. Format HDFS

[root@cluster2_host1 local]# hdfs namenode -format

  

8. Start YARN

[root@cluster2-host1 sbin]# ./start-yarn.sh

  

9. Start HDFS

[root@cluster2-host1 sbin]# ./start-dfs.sh

  

10. Check the processes

[root@cluster2-host1 data]# jps
10004 DataNode
29432 ResourceManager
8942 Jps
9263 NameNode
30095 NodeManager

  

II. Configure Kerberos authentication for HDFS

1. Install autoconf on all nodes

yum install autoconf -y

  

2. Install gcc on all nodes

yum install gcc -y

  

3. Install jsvc

tar -zxvf commons-daemon-1.2.2-src.tar.gz
cd /data/commons-daemon-1.2.2-src/src/native/unix
./support/buildconf.sh
./configure
make

  

Check that the build succeeded

[root@cluster2-host1 unix]# ./jsvc -help
Usage: jsvc [-options] class [args...]
Where options include:
    -help | --help | -?
        show this help page (implies -nodetach)
    -jvm <JVM name>
        use a specific Java Virtual Machine. Available

Then link jsvc into /usr/local/bin:

ln -s /data/commons-daemon-1.2.2-src/src/native/unix/jsvc /usr/local/bin/jsvc

  

4. Edit hadoop-env.sh

vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh

export JSVC_HOME=/data/commons-daemon-1.2.2-src/src/native/unix

export HADOOP_SECURE_DN_USER=hdfs

  

Distribute the file to the other nodes.

5. Create the hdfs and http principals

kadmin.local: addprinc hdfs/cluster2-host1
kadmin.local: addprinc hdfs/cluster2-host2
kadmin.local: addprinc hdfs/cluster2-host3
kadmin.local: addprinc http/cluster2-host1
kadmin.local: addprinc http/cluster2-host2
kadmin.local: addprinc http/cluster2-host3
kadmin.local: ktadd -norandkey -k /etc/security/keytab/hdfs.keytab hdfs/cluster2-host1
kadmin.local: ktadd -norandkey -k /etc/security/keytab/hdfs.keytab hdfs/cluster2-host2
kadmin.local: ktadd -norandkey -k /etc/security/keytab/hdfs.keytab hdfs/cluster2-host3
kadmin.local: ktadd -norandkey -k /etc/security/keytab/http.keytab http/cluster2-host1
kadmin.local: ktadd -norandkey -k /etc/security/keytab/http.keytab http/cluster2-host2
kadmin.local: ktadd -norandkey -k /etc/security/keytab/http.keytab http/cluster2-host3
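
With more hosts, typing these commands by hand becomes error-prone. The sketch below (kadmin_commands is a hypothetical helper, not part of Kerberos) only prints the same addprinc and ktadd lines for a list of hosts, so that they can be reviewed and pasted at the kadmin.local prompt:

```shell
# Hypothetical helper: print the addprinc and ktadd commands for the
# hdfs and http services on every host given after the keytab directory.
kadmin_commands() {
    local keytab_dir="$1"
    shift
    local service host
    # First create one principal per service and host.
    for service in hdfs http; do
        for host in "$@"; do
            echo "addprinc ${service}/${host}"
        done
    done
    # Then export the keys, one keytab file per service.
    for service in hdfs http; do
        for host in "$@"; do
            echo "ktadd -norandkey -k ${keytab_dir}/${service}.keytab ${service}/${host}"
        done
    done
}
```

Calling `kadmin_commands /etc/security/keytab cluster2-host1 cluster2-host2 cluster2-host3` prints the twelve commands of this step in the same order.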

  

6. Distribute the keytab files

[root@cluster2-host1 etc]# scp hdfs.keytab http.keytab root@cluster2-host2:/usr/local/hadoop/etc/
hdfs.keytab 100% 1559 1.5KB/s 00:00
http.keytab 100% 1559 1.5KB/s 00:00
[root@cluster2-host1 etc]# scp hdfs.keytab http.keytab root@cluster2-host3:/usr/local/hadoop/etc/
hdfs.keytab 100% 1559 1.5KB/s 00:00
http.keytab

  

7. Edit the Hadoop configuration files

Edit core-site.xml

<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>

  

Edit hdfs-site.xml

<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/cluster2-host1@HADOOP.COM</value>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/usr/local/hadoop/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>http/hadoop@HADOOP.COM</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.keytab</name>
<value>http/cluster2-host1@HADOOP.COM</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>hdfs/cluster2-host1@HADOOP.COM</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.keytab</name>
<value>/usr/local/hadoop/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/cluster2-host1@HADOOP.COM</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/usr/local/hadoop/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:1004</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:1006</value>
</property>

  

If the cluster has a SecondaryNameNode, also add the following configuration

<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/usr/local/hadoop/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>hdfs/cluster2-host1@HADOOP.COM</value>
</property>
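
The values above hard-code hdfs/cluster2-host1 on every node. Hadoop also accepts the placeholder _HOST in service principal names; each daemon substitutes its own fully qualified host name at runtime, which would let every node use the per-host principal created in step 5 with an identical hdfs-site.xml. This is an alternative that the article itself does not use, and it only works if each node's host name matches the host part of its principal. The sketch below (hadoop_property and principal_properties are hypothetical helper names) prints such property blocks for pasting into hdfs-site.xml:

```shell
# Hypothetical helper: print one Hadoop <property> block.
hadoop_property() {
    printf '<property>\n<name>%s</name>\n<value>%s</value>\n</property>\n' "$1" "$2"
}

# Print the HDFS principal properties with the _HOST placeholder
# instead of a fixed host name. REALM is the Kerberos realm.
principal_properties() {
    local realm="$1"
    hadoop_property dfs.namenode.kerberos.principal "hdfs/_HOST@${realm}"
    hadoop_property dfs.datanode.kerberos.principal "hdfs/_HOST@${realm}"
    hadoop_property dfs.secondary.namenode.kerberos.principal "hdfs/_HOST@${realm}"
}
```

`principal_properties HADOOP.COM` uses the realm that appears in the configuration above.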

  

Edit yarn-site.xml

<property>
<name>yarn.resourcemanager.principal</name>
<value>hdfs/cluster2-host1@HADOOP.COM</value>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/usr/local/hadoop/etc/hdfs.keytab</value>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/usr/local/hadoop/etc/hdfs.keytab</value>
</property>
<property>
<name>yarn.nodemanager.principal</name>
<value>hdfs/cluster2-host1@HADOOP.COM</value>
</property>

  

Distribute the configuration files to the other nodes

[root@cluster2-host1 hadoop]# scp core-site.xml hdfs-site.xml yarn-site.xml root@cluster2-host2:/usr/local/hadoop/etc/hadoop/
core-site.xml 100% 1241 1.2KB/s 00:00
hdfs-site.xml 100% 2544 2.5KB/s 00:00
yarn-site.xml 100% 2383 2.3KB/s 00:00
[root@cluster2-host1 hadoop]# scp core-site.xml hdfs-site.xml yarn-site.xml root@cluster2-host3:/usr/local/hadoop/etc/hadoop/
core-site.xml 100% 1241 1.2KB/s 00:00
hdfs-site.xml 100% 2544 2.5KB/s 00:00
yarn-site.xml

  

8. Start HDFS

Run the following script as the hdfs user

start-dfs.sh

[root@cluster2-host1 sbin]#
[root@cluster2-host1 sbin]# jps
32595 Secur
30061 Jps
28174 NameNode

  

Run the following script as root

./start-secure-dns.sh

Check the processes. Note that jps does not show the DataNode process here, so use ps instead

[root@cluster2-host1 sbin]# ps auxf |grep datanode

  

9. Verify

[root@cluster2-host1 hadoop]# hdfs dfs -ls /
20/03/03 08:06:40 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "cluster2-host1/10.87.18.34"; destination host is: "cluster2-host1":9000;
[root@cluster2-host1 hadoop]# kinit -kt /etc/security/keytab/hdfs.keytab hdfs/cluster2-host1
[root@cluster2-host1 hadoop]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x - root supergroup 0 2020-03-02 06:25 /flink
drwxr-xr-x - root supergroup 0 2020-03-02 04:30 /spark_jars
drwx-wx-wx - root supergroup 0 2020-03-02 21:12 /tmp
drwxr-xr-x - root supergroup 0 2020-03-02 21:11 /user
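
The first listing fails because the shell has no Kerberos ticket yet; after kinit the same command succeeds. A small wrapper can make that step explicit. The sketch assumes MIT Kerberos, where klist -s prints nothing and exits with a non-zero status when the cache holds no valid ticket; ensure_ticket is a hypothetical name:

```shell
# Hypothetical helper: obtain a ticket from KEYTAB for PRINCIPAL only
# when the credential cache does not already hold a valid one.
ensure_ticket() {
    local keytab="$1" principal="$2"
    if klist -s; then
        echo "valid ticket already in cache"
    else
        # Same kinit call as in the transcript above.
        kinit -kt "$keytab" "$principal" && echo "obtained ticket for ${principal}"
    fi
}
```

For example, `ensure_ticket /etc/security/keytab/hdfs.keytab hdfs/cluster2-host1 && hdfs dfs -ls /` would only list the root directory once a ticket is available.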

  

III. Configure Kerberos authentication for YARN

1. Edit yarn-site.xml

<property>
<name>yarn.nodemanager.container-executor.class</name>
<value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.group</name>
<value>hdfs</value>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.path</name>
<value>/hdp/bin/container-executor</value>
</property>

  

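yarn.nodemanager.linux-container-executor.path specifies the path to container-executor. container-executor is an executable binary, and it requires a configuration file, container-executor.cfg, whose expected location is checked in step 2.

yarn.nodemanager.linux-container-executor.group is the group of the user that starts the NodeManager.

2. Confirm the configuration path that container-executor expects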

[root@cluster2-host1 bin]# strings container-executor |grep etc
../etc/hadoop/container-executor.cfg
[root@cluster2-host1 bin]# cd /usr/local/hadoop/bin/
[root@cluster2-host1 bin]# ll
total 448
-rwxr-xr-x. 1 hdfs hdfs 160127 Jun 29 2015 container-executor
-rwxr-xr-x. 1 hdfs hdfs 6488 Jun 29 2015 hadoop
-rwxr-xr-x. 1 hdfs hdfs 8786 Jun 29 2015 hadoop.cmd
-rwxr-xr-x. 1 hdfs hdfs 12223 Jun 29 2015 hdfs
-rwxr-xr-x. 1 hdfs hdfs 7327 Jun 29 2015 hdfs.cmd
-rwxr-xr-x. 1 hdfs hdfs 5953 Jun 29 2015 mapred
-rwxr-xr-x. 1 hdfs hdfs 6310 Jun 29 2015 mapred.cmd
-rwxr-xr-x. 1 hdfs hdfs 1776 Jun 29 2015 rcc
-rwxr-xr-x. 1 hdfs hdfs 204075 Jun 29 2015 test-container-executor
-rwxr-xr-x. 1 hdfs hdfs 13308 Jun 29 2015 yarn
-rwxr-xr-x. 1 hdfs hdfs 11386 Jun 29 2015 yarn.cmd

  

3. Create the directories and copy the executable and its configuration file into them

[root@cluster2-host1 bin]# mkdir -p /hdp/bin
[root@cluster2-host1 bin]# mkdir -p /hdp/etc/hadoop
[root@cluster2-host1 bin]# scp /usr/local/hadoop/bin/container-executor /hdp/bin/
[root@cluster2-host1 bin]# scp /usr/local/hadoop/etc/hadoop/container-executor.cfg /hdp/etc/hadoop/

  

Edit the configuration file so that it contains the following

yarn.nodemanager.linux-container-executor.group=hdfs
banned.users=mysql
min.user.id=500
allowed.system.users=root
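
container-executor refuses to launch containers for users listed in banned.users, and for users whose UID is below min.user.id unless they are listed in allowed.system.users. It is therefore worth comparing the UID of the submitting user against the file. A minimal sketch, with cfg_value as a hypothetical helper:

```shell
# Hypothetical helper: print the value of KEY from a key=value file such
# as container-executor.cfg (the last matching line wins).
cfg_value() {
    local file="$1" key="$2"
    awk -F= -v k="$key" '$1 == k { v = $2 } END { if (v != "") print v }' "$file"
}
```

For example, `[ "$(id -u hdfs)" -ge "$(cfg_value /hdp/etc/hadoop/container-executor.cfg min.user.id)" ] && echo ok` prints ok when the hdfs user passes the min.user.id check.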

  

4. Change the ownership of the executable

[root@cluster2-host1 hadoop]# ll /hdp/bin/container-executor
-rwxr-xr-x. 1 root hdfs 160127 Mar 3 20:12 /hdp/bin/container-executor
[root@cluster2-host1 hadoop]# ll /hdp/etc/hadoop/
total 4
-rw-r--r--. 1 root root 318 Mar 3 20:13 container-executor.cfg
[root@cluster2-host1 hadoop]#

  

Change the permissions

[root@cluster2-host1 hadoop]# chmod 6050 /hdp/bin/container-executor
[root@cluster2-host1 hadoop]# ll /hdp/bin/container-executor
---Sr-s---. 1 root hdfs 160127 Mar 3 20:12 /hdp/bin/container-executor
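
The listing shows the target state: owner root, group hdfs, and mode 6050 (setuid and setgid, with read and execute for the group only). The command that set the owner is not shown in the article; chown root:hdfs /hdp/bin/container-executor would produce it. The following sketch (has_mode is a hypothetical helper and assumes GNU stat, as found on the yum-based system used here) checks the mode afterwards:

```shell
# Hypothetical helper: succeed only when FILE has exactly the EXPECTED
# octal mode, as reported by GNU stat.
has_mode() {
    local file="$1" expected="$2"
    [ "$(stat -c '%a' "$file")" = "$expected" ]
}
```

For example, `has_mode /hdp/bin/container-executor 6050 || echo "unexpected mode"` reports a problem before the NodeManager does.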

  

5. Run the following checks; if the output matches, container-executor is configured correctly

[root@cluster2-host1 hadoop]# hadoop checknative
20/03/03 20:29:41 WARN bzip2.Bzip2Factory: Failed to load/initialize native-bzip2 library system-native, will use pure-Java version
20/03/03 20:29:41 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop: true /usr/local/hadoop/lib/native/libhadoop.so.1.0.0
zlib: true /lib64/libz.so.1
snappy: true /lib64/libsnappy.so.1
lz4: true revision:99
bzip2: false
openssl: false Cannot load libcrypto.so (libcrypto.so: cannot open shared object file: No such file or directory)!
[root@cluster2-host1 hadoop]# /hdp/bin/container-executor --checksetup
[root@cluster2-host1 hadoop]#

  

6. Copy the /hdp directory to the other nodes, and set the same ownership and permissions there

[root@cluster2-host1 sbin]# scp /hdp/etc/hadoop/container-executor.cfg root@cluster2-host2:/hdp/etc/hadoop/
container-executor.cfg

  

7. Start YARN

[root@cluster2-host1 sbin]# ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-root-resourcemanager-cluster2-host1.out
cluster2-host3: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-cluster2-host3.out
cluster2-host2: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-cluster2-host2.out
cluster2-host1: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-cluster2-host1.out

  

8. Verify that YARN works with Kerberos; if the job below runs normally, the configuration is complete

[root@cluster2-host1 hadoop]# ./bin/hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output

  

The output is as follows

[root@cluster2-host1 hadoop]# hdfs dfs -ls /output
Found 2 items
-rw-r--r-- 2 hdfs supergroup 0 2020-03-03 21:40 /output/_SUCCESS
-rw-r--r-- 2 hdfs supergroup

  
