Building a Highly Reliable Hadoop Cluster, Part 4: Secure Mode
This article is largely a translation of http://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/SecureMode.html
Translator's note: "secure mode" is deliberately rendered with a different term than the NameNode's startup "safemode"; using one word for both would invite confusion.
In terms of importance, the material covered here ranks very high. In terms of setup, however, Kerberos configuration is arguably the most tedious and error-prone part of building a Hadoop cluster.
Introducing Kerberos authentication means standing up an additional Kerberos server, which easily creates a new single point of failure and runs against the spirit of HA. That can be avoided by building a Kerberos cluster instead, but then things get even more complex.
If Kerberos feels like too much trouble, you can adopt only service-level authorization, web-console authentication, and data confidentiality.
At the time of translation the author had no hands-on experience configuring Kerberos on Hadoop, so this article exists purely as a study note, meant to reinforce memory and serve as a future reference. If actual deployments follow, the relevant configurations will be appended.
1. Introduction
This document describes how to configure authentication for Hadoop in secure mode. When Hadoop runs in secure mode, every Hadoop service and every user must be authenticated by Kerberos.
Host lookup for all service hosts must be configured correctly so that services can authenticate each other; it can be implemented with either DNS or /etc/hosts. Configuring Hadoop secure mode with both Kerberos and DNS is recommended.
The security features consist of authentication, service-level authorization, authentication for web consoles, and data confidentiality.
2. Authentication
2.1 End-user accounts
When service-level authentication is enabled, end users must authenticate themselves before interacting with Hadoop services. The simplest way is to authenticate interactively with the Kerberos kinit command.
If an interactive login with kinit is infeasible, a Kerberos keytab file can be used instead.
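As a sketch, the two login paths look like this; the principal alice@EXAMPLE.COM and the keytab path are made-up examples, and the commands require a reachable KDC:

```shell
# Interactive login: obtain a ticket-granting ticket by typing the password
kinit alice@EXAMPLE.COM

# Non-interactive login from a keytab (e.g. for scripts and cron jobs)
kinit -kt /etc/security/keytab/alice.keytab alice@EXAMPLE.COM

# Inspect the credential cache
klist
```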
2.2 User accounts for Hadoop daemons
Make sure the HDFS and YARN daemons run as different Unix users (that is, different operating-system accounts), e.g. hdfs and yarn respectively.
Also make sure the MapReduce JobHistory server runs as yet another user, e.g. mapred.
It is further recommended that all of them share the same Unix group, e.g. hadoop.
The recommended users and groups are:
User:Group | Daemons |
---|---|
hdfs:hadoop | NameNode, Secondary NameNode, JournalNode, DataNode |
yarn:hadoop | ResourceManager, NodeManager |
mapred:hadoop | MapReduce JobHistory Server |
Translator's note: when the original "recommends" something, it usually means "you'd better do it that way". Deviating may still work, but expect twists and turns.
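On each node the recommended accounts could be created roughly as follows (a sketch that requires root; user and group names follow the table above, flag spellings are the common Linux ones):

```shell
groupadd hadoop
useradd -g hadoop hdfs
useradd -g hadoop yarn
useradd -g hadoop mapred
```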
2.3 Kerberos principals for Hadoop daemons
Every Hadoop service instance must be configured with its Kerberos principal and the location of its keytab file.
The general format of a service principal is ServiceName/_HOST@REALM.TLD, e.g. dn/_HOST@EXAMPLE.COM.
Hadoop simplifies the deployment of these configuration files by allowing the hostname component to be written as the _HOST wildcard. At runtime, each service instance substitutes its own fully qualified hostname for _HOST, so administrators can deploy the same set of configuration files to every node.
The keytab files, however, must still be customized: a different one for each host.
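The _HOST substitution can be sketched in Python as follows. This is only an approximation for illustration; the real logic lives inside Hadoop, and `resolve_principal` is a hypothetical helper name.

```python
import socket

def resolve_principal(principal, hostname=None):
    """Mimic what a Hadoop daemon does at startup: replace the _HOST
    placeholder in a configured principal with this node's FQDN."""
    host = hostname or socket.getfqdn()
    service, sep, rest = principal.partition("/")
    if sep and rest.startswith("_HOST@"):
        return service + "/" + host + rest[len("_HOST"):]
    return principal  # no placeholder: use the principal exactly as configured

# The same configuration line yields a different principal on every node:
print(resolve_principal("dn/_HOST@EXAMPLE.COM", "dn1.example.com"))
# → dn/dn1.example.com@EXAMPLE.COM
```

This is why one hdfs-site.xml can be shared cluster-wide while each keytab stays host-specific.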
2.3.1 HDFS keytab files
For the NameNode:
$ klist -e -k -t /etc/security/keytab/nn.service.keytab
Keytab name: FILE:/etc/security/keytab/nn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
Translator's note: in the original post the truly relevant lines were highlighted in red; that highlighting is lost in this plain-text copy.
For the Secondary NameNode:
$ klist -e -k -t /etc/security/keytab/sn.service.keytab
Keytab name: FILE:/etc/security/keytab/sn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
For the DataNode:
$ klist -e -k -t /etc/security/keytab/dn.service.keytab
Keytab name: FILE:/etc/security/keytab/dn.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
Translator's note: each of the three node types above defines only two service names; there is nothing for HTTP.
2.3.2 YARN keytab files
For the ResourceManager:
$ klist -e -k -t /etc/security/keytab/rm.service.keytab
Keytab name: FILE:/etc/security/keytab/rm.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
For the NodeManager:
$ klist -e -k -t /etc/security/keytab/nm.service.keytab
Keytab name: FILE:/etc/security/keytab/nm.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
2.3.3 MapReduce JobHistory Server keytab file
$ klist -e -k -t /etc/security/keytab/jhs.service.keytab
Keytab name: FILE:/etc/security/keytab/jhs.service.keytab
KVNO Timestamp Principal
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC)
4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
2.4 Mapping Kerberos principals to OS user accounts
Hadoop maps Kerberos principals to OS user (system) accounts using rules specified by hadoop.security.auth_to_local. These rules work in the same way as the auth_to_local in Kerberos configuration file (krb5.conf). In addition, Hadoop auth_to_local mapping supports the /L flag that lowercases the returned name.
The default is to pick the first component of the principal name as the system user name if the realm matches the default_realm (usually defined in /etc/krb5.conf). e.g. The default rule maps the principal host/full.qualified.domain.name@REALM.TLD to system user host. The default rule will not be appropriate for most clusters.
In a typical cluster HDFS and YARN services will be launched as the system hdfs and yarn users respectively. hadoop.security.auth_to_local can be configured as follows:
<property>
<name>hadoop.security.auth_to_local</name>
<value>
RULE:[2:$1@$0](nn/.*@.*REALM.TLD)s/.*/hdfs/
RULE:[2:$1@$0](jn/.*@.*REALM.TLD)s/.*/hdfs/
RULE:[2:$1@$0](dn/.*@.*REALM.TLD)s/.*/hdfs/
RULE:[2:$1@$0](nm/.*@.*REALM.TLD)s/.*/yarn/
RULE:[2:$1@$0](rm/.*@.*REALM.TLD)s/.*/yarn/
RULE:[2:$1@$0](jhs/.*@.*REALM.TLD)s/.*/mapred/
DEFAULT
</value>
</property>
Custom rules can be tested with the `hadoop kerbname` command, which evaluates the auth_to_local rule set against a given principal name.
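To build some intuition for these rules, here is a simplified Python sketch of how one RULE:[n:fmt](regex)s/pattern/replacement/ entry is evaluated. It is an illustration, not Hadoop's actual KerberosName evaluator; the example principal and regex are made up, and the match regex is applied to the string produced by the [n:fmt] part.

```python
import re

def apply_rule(principal, num_components, fmt, match_re, sub_pattern, sub_repl):
    """Evaluate one auth_to_local rule of the form
    RULE:[n:fmt](match_re)s/sub_pattern/sub_repl/ against a principal."""
    name, _, realm = principal.partition("@")
    components = name.split("/")
    if len(components) != num_components:
        return None  # the [n:...] part only applies to n-component principals
    # $0 expands to the realm, $1..$n to the principal components
    short = fmt.replace("$0", realm)
    for i, comp in enumerate(components, start=1):
        short = short.replace("$" + str(i), comp)
    if not re.fullmatch(match_re, short):
        return None  # this rule does not match; the next rule would be tried
    # count=1 replaces the single full match, like sed's s/pattern/repl/
    return re.sub(sub_pattern, sub_repl, short, count=1)

# A NameNode principal on any host collapses to the hdfs system user:
print(apply_rule("nn/nn1.example.com@REALM.TLD", 2, "$1@$0",
                 r"nn@.*REALM\.TLD", r".*", "hdfs"))
# → hdfs
```

Non-matching principals fall through to the next rule, and ultimately to DEFAULT.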
Translator's note: the rules above are fairly hard to decipher.
2.5 Mapping users to groups
The mapping from system users to system groups can be configured via hadoop.security.group.mapping; see Hadoop Groups Mapping for details.
In practice, administrators often combine Kerberos with LDAP to build single sign-on for Hadoop in secure mode.
2.6 Proxy users
Some products, such as Apache Oozie, need to access Hadoop services on behalf of end users; this is done through proxy users.
2.7 Secure DataNodes
Because the DataNode's data transfer protocol does not use the Hadoop RPC framework, DataNodes must authenticate themselves by the privileged ports they bind, configured via dfs.datanode.address and dfs.datanode.http.address. This authentication rests on the assumption that an attacker cannot gain root privileges on the DataNode hosts.
When the hdfs datanode command is executed as root, the server process first binds the privileged ports and then drops privileges to the user account specified by HADOOP_SECURE_DN_USER. This startup path uses the jsvc program installed at JSVC_HOME, so both HADOOP_SECURE_DN_USER and JSVC_HOME must be set as environment variables (in hadoop-env.sh).
As of version 2.6.0, SASL can be used instead to authenticate the data transfer protocol, so a DataNode started as root no longer needs jsvc or privileged ports. To enable SASL on the data transfer protocol, set dfs.data.transfer.protection in hdfs-site.xml, set a non-privileged port for dfs.datanode.address (translator's note: a port above 1024), set dfs.http.policy to HTTPS_ONLY, and make sure HADOOP_SECURE_DN_USER is not defined in hadoop-env.sh. Note that it is impossible to use SASL on the data transfer protocol if dfs.datanode.address is bound to a privileged port; binding a privileged port exists only for backwards compatibility.
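Put together, a SASL-based DataNode setup in hdfs-site.xml might look like the following sketch; the port 10019 is only an example of a non-privileged port, and integrity is just one possible protection level:

```xml
<property>
  <name>dfs.data.transfer.protection</name>
  <value>integrity</value>
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:10019</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```

Remember also to leave HADOOP_SECURE_DN_USER unset in hadoop-env.sh.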
To migrate an existing cluster that uses root authentication over to SASL, first make sure version 2.6.0 or later has been deployed everywhere, including any external applications, whose configuration must also be updated to enable SASL.
A SASL-enabled client can connect to DataNodes running with either root authentication or SASL authentication.
Finally, change the configuration of each DataNode and restart it. During the migration it is acceptable for some DataNodes to use root authentication while others use SASL.
3. Data confidentiality
3.1 Encryption on RPC
Data transferred between Hadoop services and clients can be encrypted by setting hadoop.rpc.protection to privacy in core-site.xml.
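For example, the corresponding core-site.xml entry:

```xml
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>
```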
3.2 Encryption on block data transfer
Setting dfs.encrypt.data.transfer to true in hdfs-site.xml activates encryption for the DataNode's data transfer protocol.
Optionally, dfs.encrypt.data.transfer.algorithm may be set to 3des or rc4 to choose the encryption algorithm. If unset, 3DES is used by default.
Setting dfs.encrypt.data.transfer.cipher.suites to AES/CTR/NoPadding activates AES encryption. By default it is unset, so AES is not used. Even with AES enabled, the algorithm from the previous parameter is still used during the initial key exchange.
dfs.encrypt.data.transfer.cipher.key.bitlength sets the AES key length; the value can be 128, 192 or 256, with 128 being the default.
AES offers the greatest cryptographic strength and the best performance; to date, though, 3DES and RC4 have been used more often in Hadoop clusters.
Translator's note: data encryption puts extra load on the system; weigh it carefully before deploying.
3.3 Encryption on HTTP
Data transferred between clients and the web consoles can be protected with SSL (HTTPS). SSL configuration is recommended but not required.
To enable SSL, set dfs.http.policy to HTTPS_ONLY or HTTP_AND_HTTPS in hdfs-site.xml. Note that this does not affect KMS or HttpFS, which are implemented on top of Tomcat and do not honor this parameter.
To enable HTTPS for KMS and HttpFS, refer to their own documentation.
To enable it for YARN, set yarn.http.policy=HTTPS_ONLY in yarn-site.xml.
To enable it for the MapReduce JobHistory server, set mapreduce.jobhistory.http.policy=HTTPS_ONLY in mapred-site.xml.
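Collected in one place, the HTTPS-only settings just described would read as follows; each property belongs in the file named in the comment above it:

```xml
<!-- hdfs-site.xml -->
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>

<!-- yarn-site.xml -->
<property>
  <name>yarn.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapreduce.jobhistory.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```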
4. Configuration
Translator's note: the settings below are quite extensive and, for a large cluster, represent no small amount of work. This chapter lists every secure-mode-related configuration item and gives sample values (some of which are the only valid values).
Some of the table contents are easy to understand, so a few descriptions are not translated and are simply copied here.
Permissions for both HDFS and the local file system
The table below gives the recommended permissions:
Filesystem | Path | User:Group | Permissions |
---|---|---|---|
local | dfs.namenode.name.dir | hdfs:hadoop | drwx------ |
local | dfs.datanode.data.dir | hdfs:hadoop | drwx------ |
local | $HADOOP_LOG_DIR | hdfs:hadoop | drwxrwxr-x |
local | $YARN_LOG_DIR | yarn:hadoop | drwxrwxr-x |
local | yarn.nodemanager.local-dirs | yarn:hadoop | drwxr-xr-x |
local | yarn.nodemanager.log-dirs | yarn:hadoop | drwxr-xr-x |
local | container-executor | root:hadoop | --Sr-s--* |
local | conf/container-executor.cfg | root:hadoop | r-------* |
hdfs | / | hdfs:hadoop | drwxr-xr-x |
hdfs | /tmp | hdfs:hadoop | drwxrwxrwxt |
hdfs | /user | hdfs:hadoop | drwxr-xr-x |
hdfs | yarn.nodemanager.remote-app-log-dir | yarn:hadoop | drwxrwxrwxt |
hdfs | mapreduce.jobhistory.intermediate-done-dir | mapred:hadoop | drwxrwxrwxt |
hdfs | mapreduce.jobhistory.done-dir | mapred:hadoop | drwxr-x--- |
Common configuration
Set the following in core-site.xml on all nodes:
Parameter | Value | Notes |
---|---|---|
hadoop.security.authentication | kerberos | simple : No authentication. (default) kerberos : Enable authentication by Kerberos. |
hadoop.security.authorization | true | Enable RPC service-level authorization. |
hadoop.rpc.protection | authentication | authentication : authentication only (default); integrity : integrity check in addition to authentication; privacy : data encryption in addition to integrity |
hadoop.security.auth_to_local | RULE:exp1 RULE:exp2 … DEFAULT | The value is string containing new line characters. See Kerberos documentation for the format of exp. |
hadoop.proxyuser.superuser.hosts | comma separated hosts from which superuser access are allowed to impersonation. * means wildcard. | |
hadoop.proxyuser.superuser.groups | comma separated groups to which users impersonated by superuser belong. * means wildcard. | |
Configuration for the NameNode
Parameter | Value | Notes |
---|---|---|
dfs.block.access.token.enable | true | Enable HDFS block access tokens for secure operations. |
dfs.namenode.kerberos.principal | nn/_HOST@REALM.TLD | Kerberos principal name for the NameNode. |
dfs.namenode.keytab.file | /etc/security/keytab/nn.service.keytab | Kerberos keytab file for the NameNode. |
dfs.namenode.kerberos.internal.spnego.principal | HTTP/_HOST@REALM.TLD | The server principal used by the NameNode for web UI SPNEGO authentication. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is '*', the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal} i.e use the value of dfs.web.authentication.kerberos.principal. |
dfs.web.authentication.kerberos.keytab | /etc/security/keytab/spnego.service.keytab | SPNEGO keytab file for the NameNode. In HA clusters this setting is shared with the Journal Nodes. |
To enable SSL for the NameNode web UI:
Parameter | Value | Notes |
---|---|---|
dfs.http.policy | HTTP_ONLY or HTTPS_ONLY or HTTP_AND_HTTPS | HTTPS_ONLY turns off http access. This option takes precedence over the deprecated configuration dfs.https.enable and hadoop.ssl.enabled. If using SASL to authenticate data transfer protocol instead of running DataNode as root and using privileged ports, then this property must be set to HTTPS_ONLY to guarantee authentication of HTTP servers. (See dfs.data.transfer.protection.) |
dfs.namenode.https-address | 0.0.0.0:50470 | This parameter is used in non-HA mode and without federation. See HDFS High Availability and HDFS Federation for details. |
dfs.https.enable | true | This value is deprecated. Use dfs.http.policy |
Configuration for the Secondary NameNode
Parameter | Value | Notes |
---|---|---|
dfs.namenode.secondary.http-address | 0.0.0.0:50090 | HTTP web UI address for the Secondary NameNode. |
dfs.namenode.secondary.https-address | 0.0.0.0:50091 | HTTPS web UI address for the Secondary NameNode. |
dfs.secondary.namenode.keytab.file | /etc/security/keytab/sn.service.keytab | Kerberos keytab file for the Secondary NameNode. |
dfs.secondary.namenode.kerberos.principal | sn/_HOST@REALM.TLD | Kerberos principal name for the Secondary NameNode. |
dfs.secondary.namenode.kerberos.internal.spnego.principal | HTTP/_HOST@REALM.TLD | The server principal used by the Secondary NameNode for web UI SPNEGO authentication. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is '*', the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal} i.e use the value of dfs.web.authentication.kerberos.principal. |
Configuration for the JournalNodes (JNs)
Parameter | Value | Notes |
---|---|---|
dfs.journalnode.kerberos.principal | jn/_HOST@REALM.TLD | Kerberos principal name for the JournalNode. |
dfs.journalnode.keytab.file | /etc/security/keytab/jn.service.keytab | Kerberos keytab file for the JournalNode. |
dfs.journalnode.kerberos.internal.spnego.principal | HTTP/_HOST@REALM.TLD | The server principal used by the JournalNode for web UI SPNEGO authentication when Kerberos security is enabled. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is '*', the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal} i.e use the value of dfs.web.authentication.kerberos.principal. |
dfs.web.authentication.kerberos.keytab | /etc/security/keytab/spnego.service.keytab | SPNEGO keytab file for the JournalNode. In HA clusters this setting is shared with the Name Nodes. |
dfs.journalnode.https-address | 0.0.0.0:8481 | HTTPS web UI address for the JournalNode. |
Configuration for the DataNodes
Parameter | Value | Notes |
---|---|---|
dfs.datanode.data.dir.perm | 700 | |
dfs.datanode.address | 0.0.0.0:1004 | Secure DataNode must use privileged port in order to assure that the server was started securely. This means that the server must be started via jsvc. Alternatively, this must be set to a non-privileged port if using SASL to authenticate data transfer protocol. (See dfs.data.transfer.protection.) |
dfs.datanode.http.address | 0.0.0.0:1006 | Secure DataNode must use privileged port in order to assure that the server was started securely. This means that the server must be started via jsvc. |
dfs.datanode.https.address | 0.0.0.0:50475 | HTTPS web UI address for the Data Node. |
dfs.datanode.kerberos.principal | dn/_HOST@REALM.TLD | Kerberos principal name for the DataNode. |
dfs.datanode.keytab.file | /etc/security/keytab/dn.service.keytab | Kerberos keytab file for the DataNode. |
dfs.encrypt.data.transfer | false | set to true when using data encryption |
dfs.encrypt.data.transfer.algorithm | optionally set to 3des or rc4 when using data encryption to control encryption algorithm | |
dfs.encrypt.data.transfer.cipher.suites | optionally set to AES/CTR/NoPadding to activate AES encryption when using data encryption | |
dfs.encrypt.data.transfer.cipher.key.bitlength | optionally set to 128, 192 or 256 to control key bit length when using AES with data encryption | |
dfs.data.transfer.protection | authentication : authentication only; integrity : integrity check in addition to authentication; privacy : data encryption in addition to integrity This property is unspecified by default. Setting this property enables SASL for authentication of data transfer protocol. If this is enabled, then dfs.datanode.address must use a non-privileged port, dfs.http.policy must be set to HTTPS_ONLY and the HADOOP_SECURE_DN_USER environment variable must be undefined when starting the DataNode process. |
webHDFS
Parameter | Value | Notes |
---|---|---|
dfs.web.authentication.kerberos.principal | http/_HOST@REALM.TLD | Kerberos principal name for the WebHDFS. In HA clusters this setting is commonly used by the JournalNodes for securing access to the JournalNode HTTP server with SPNEGO. |
dfs.web.authentication.kerberos.keytab | /etc/security/keytab/http.service.keytab | Kerberos keytab file for WebHDFS. In HA clusters this setting is commonly used the JournalNodes for securing access to the JournalNode HTTP server with SPNEGO. |
Configuration for the ResourceManager
Parameter | Value | Notes |
---|---|---|
yarn.resourcemanager.principal | rm/_HOST@REALM.TLD | Kerberos principal name for the ResourceManager. |
yarn.resourcemanager.keytab | /etc/security/keytab/rm.service.keytab | Kerberos keytab file for the ResourceManager. |
yarn.resourcemanager.webapp.https.address | ${yarn.resourcemanager.hostname}:8090 | The https adddress of the RM web application for non-HA. In HA clusters, use yarn.resourcemanager.webapp.https.address.rm-id for each ResourceManager. See ResourceManager High Availability for details. |
Configuration for the NodeManagers
Parameter | Value | Notes |
---|---|---|
yarn.nodemanager.principal | nm/_HOST@REALM.TLD | Kerberos principal name for the NodeManager. |
yarn.nodemanager.keytab | /etc/security/keytab/nm.service.keytab | Kerberos keytab file for the NodeManager. |
yarn.nodemanager.container-executor.class | org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor | Use LinuxContainerExecutor. |
yarn.nodemanager.linux-container-executor.group | hadoop | Unix group of the NodeManager. |
yarn.nodemanager.linux-container-executor.path | /path/to/bin/container-executor | The path to the executable of Linux container executor. |
yarn.nodemanager.webapp.https.address | 0.0.0.0:8044 | The https adddress of the NM web application. |
Configuration for the WebAppProxy
Parameter | Value | Notes |
---|---|---|
yarn.web-proxy.address | WebAppProxy host:port for proxy to AM web apps. | host:port if this is the same as yarn.resourcemanager.webapp.address or it is not defined then the ResourceManager will run the proxy otherwise a standalone proxy server will need to be launched. |
yarn.web-proxy.keytab | /etc/security/keytab/web-app.service.keytab | Kerberos keytab file for the WebAppProxy. |
yarn.web-proxy.principal | wap/_HOST@REALM.TLD | Kerberos principal name for the WebAppProxy. |
LinuxContainerExecutor
A ContainerExecutor is used by YARN; it defines how containers are launched and controlled.
ContainerExecutor | Description |
---|---|
DefaultContainerExecutor | The default executor which YARN uses to manage container execution. The container process has the same Unix user as the NodeManager. |
LinuxContainerExecutor | Supported only on GNU/Linux, this executor runs the containers as either the YARN user who submitted the application (when full security is enabled) or as a dedicated user (defaults to nobody) when full security is not enabled. When full security is enabled, this executor requires all user accounts to be created on the cluster nodes where the containers are launched. It uses a setuid executable that is included in the Hadoop distribution. The NodeManager uses this executable to launch and kill containers. The setuid executable switches to the user who has submitted the application and launches or kills the containers. For maximum security, this executor sets up restricted permissions and user/group ownership of local files and directories used by the containers such as the shared objects, jars, intermediate files, log files etc. Particularly note that, because of this, except the application owner and NodeManager, no other user can access any of the local files/directories including those localized as part of the distributed cache. |
To build the LinuxContainerExecutor executable, run:
$ mvn package -Dcontainer-executor.conf.dir=/etc/hadoop/
Once built, the executable must be installed under $HADOOP_YARN_HOME/bin. It must have 6050 permissions, be owned by root, and be group-owned by a special group (e.g. hadoop) of which the NodeManager Unix user is a member and to which no ordinary application user belongs. This group must be configured through yarn.nodemanager.linux-container-executor.group in both conf/yarn-site.xml and conf/container-executor.cfg.
For example, suppose the NodeManager runs as user yarn, which belongs to the groups users and hadoop, either of them being its primary group; and suppose users also contains another user alice (an application submitter), while alice is not a member of hadoop.
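Installing the binary with the required ownership and permissions could look like this sketch (the path follows the $HADOOP_YARN_HOME/bin convention above):

```shell
chown root:hadoop $HADOOP_YARN_HOME/bin/container-executor
chmod 6050 $HADOOP_YARN_HOME/bin/container-executor
ls -l $HADOOP_YARN_HOME/bin/container-executor   # should show ---Sr-s---
```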
Settings in conf/container-executor.cfg:
Parameter | Value | Notes |
---|---|---|
yarn.nodemanager.linux-container-executor.group | hadoop | Unix group of the NodeManager. The group owner of the container-executor binary should be this group. Should be same as the value with which the NodeManager is configured. This configuration is required for validating the secure access of the container-executor binary. |
banned.users | hdfs,yarn,mapred,bin | Banned users. |
allowed.system.users | foo,bar | Allowed system users. |
min.user.id | 1000 | Prevent other super-users. |
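A minimal conf/container-executor.cfg matching the table above might read (the user lists are the table's illustrative values):

```
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
allowed.system.users=foo,bar
min.user.id=1000
```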
Path permissions related to the LinuxContainerExecutor:
Filesystem | Path | User:Group | Permissions |
---|---|---|---|
local | container-executor | root:hadoop | --Sr-s--* |
local | conf/container-executor.cfg | root:hadoop | r-------* |
local | yarn.nodemanager.local-dirs | yarn:hadoop | drwxr-xr-x |
local | yarn.nodemanager.log-dirs | yarn:hadoop | drwxr-xr-x |
Configuration for the MapReduce JobHistory server
Parameter | Value | Notes |
---|---|---|
mapreduce.jobhistory.address | MapReduce JobHistory Server host:port | Default port is 10020. |
mapreduce.jobhistory.keytab | /etc/security/keytab/jhs.service.keytab | Kerberos keytab file for the MapReduce JobHistory Server. |
mapreduce.jobhistory.principal | jhs/_HOST@REALM.TLD | Kerberos principal name for the MapReduce JobHistory Server. |
5. Multihoming
Multihoming means a single host has multiple addresses and hostnames; the name resolution is usually arranged via DNS (translator's note: this is why the original recommends DNS at the beginning).
Multihomed networks make the Kerberos configuration somewhat more complicated; see the Hadoop documentation on multihomed networks for details.
Translator's note: very few deployments actually use this.
6. Troubleshooting
Kerberos is hard to set up, and harder to debug. Common problem areas are:
1. Network and DNS configuration
2. Kerberos configuration (/etc/krb5.conf)
3. Keytab creation and maintenance
4. Environment setup: JVM, user login, system clocks, and so on
The fact is that the error messages emitted by the JVM are close to useless. (Translator's note: even the original starts grumbling here.)
You can set the environment variable HADOOP_JAAS_DEBUG=true on both clients and servers.
Edit Hadoop's log4j.properties and set log4j.logger.org.apache.hadoop.security=DEBUG.
JVM-level debugging can be enabled by setting system properties: export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug"
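The environment settings above can be collected in one place, e.g. hadoop-env.sh or an interactive client shell (a sketch; the flag values follow the text, the log4j change still goes in log4j.properties separately):

```shell
# Enable JAAS and JVM-level Kerberos/SPNEGO debugging for Hadoop commands
export HADOOP_JAAS_DEBUG=true
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug"
echo "$HADOOP_OPTS"
```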
Translator's note: whether to enable Kerberos at all is a headache. But security is not a single technique: an alternative is to build an isolated network for Hadoop and reach it only through a few gateway machines.
The current Kerberos configuration is daunting in its complexity, especially on a large Hadoop cluster; yet a cluster of only a few machines has limited value anyway.
Perhaps a future Hadoop version could learn from the Windows style of authentication and greatly lighten the administrator's load.
7. Troubleshooting with KDiag
Hadoop ships a tool, KDiag, to help diagnose the configuration.
The tool contains a series of probes of the JVM's configuration and of the environment; it can dump some system files (e.g. /etc/krb5.conf, /etc/ntp.conf), print some system state, and then attempt a Kerberos login as the current user, or as a specific principal from a named keytab.
The output of the command can be used for local diagnosis or passed on to whoever is relevant, such as the cluster administrator.
The KDiag command has its own entry point; it is currently not hooked into the end-user CLI.
It is invoked simply by giving the full classname to one of the hadoop, hdfs or yarn commands:
hadoop org.apache.hadoop.security.KDiag
hdfs org.apache.hadoop.security.KDiag
yarn org.apache.hadoop.security.KDiag
A return code of 0 means KDiag ran successfully, which only means the class ran through: it does not prove Kerberos is working, merely that KDiag's various probes succeeded. It does not connect to any remote service, nor does it confirm that the client is trusted by any service.
On failure:
- -1: failed for an unknown reason
- 41: Unauthorized (== HTTP's 401).
7.1 KDiag usage
KDiag: Diagnose Kerberos Problems
[-D key=value] : Define a configuration option.
[--jaas] : Require a JAAS file to be defined in java.security.auth.login.config.
[--keylen <keylen>] : Require a minimum size for encryption keys supported by the JVM. Default value : 256.
[--keytab <keytab> --principal <principal>] : Login from a keytab as a specific principal.
[--nofail] : Do not fail on the first problem.
[--nologin] : Do not attempt to log in.
[--out <file>] : Write output to a file.
[--resource <resource>] : Load an XML configuration resource.
[--secure] : Require the hadoop configuration to be secure.
[--verifyshortname <principal>]: Verify the short name of the specific principal does not contain '@' or '/'
--jaas
The Java system property java.security.auth.login.config must point to a JAAS file; the file must exist, be non-empty, and be readable by the current user. Hadoop itself does not need this file, but ZooKeeper does for secure operation.
--keylen
The command fails if the JVM does not support encryption keys of this length.
The default key length is 256; the JVM needs the Java Cryptography Extensions installed to support such lengths. Alternatively, a shorter length may be requested.
--keytab
Log in as the specific principal defined in the keytab.
- The keytab must contain that specific principal, including any named host. That is, there is no mapping from _HOST to the current hostname.
- KDiag will log out and attempt to log back in again; this catches some JVM compatibility problems of the past.
--nofail
KDiag does not stop at the first problem it encounters. The limitation: the first problem found may well be the critical one, but this option does yield a more detailed report.
--nologin
Do not log in: the keytab option is skipped, and no login is attempted as the kinit-ed user either.
This is meant for checking only the basic Kerberos preconditions, such as the fundamental Kerberos configuration.
--out
示例: hadoop org.apache.hadoop.security.KDiag --out out.txt
The original suggests redirecting the JRE output (stderr) and the log4j output (stdout) into the same file, like so:
hadoop org.apache.hadoop.security.KDiag --keytab zk.service.keytab --principal zookeeper/devix.example.org@REALM > out.txt 2>&1
--resource
When using the hdfs and yarn commands, this can be used to force-load the hdfs-site.xml and yarn-site.xml resource files so that the Kerberos-related configuration is picked up. The core-site.xml configuration is always loaded.
hdfs org.apache.hadoop.security.KDiag --resource hbase-default.xml --resource hbase-site.xml
yarn org.apache.hadoop.security.KDiag --resource yarn-default.xml --resource yarn-site.xml
--secure
The command fails if the configuration contains the following:
<property>
<name>hadoop.security.authentication</name>
<value>simple</value>
</property>
Needless to say, an application so configured cannot talk to a secure Hadoop cluster. (Translator's note: the original puts it nicely, so it is left as-is. It is not rare for a careless admin to forget this setting.)
--verifyshortname
Verifies that the short name derived from the given principal contains neither an '@' nor a '/' character.
Example:
hdfs org.apache.hadoop.security.KDiag \
--nofail \
--resource hbase-default.xml --resource hbase-site.xml \
--keylen 1024 \
--keytab zk.service.keytab --principal zookeeper/devix.example.org@REALM
8. References
- O’Malley O et al. Hadoop Security Design
- O’Malley O, Hadoop Security Architecture
- Troubleshooting Kerberos on Java 7
- Troubleshooting Kerberos on Java 8
- Java 7 Kerberos Requirements
- Java 8 Kerberos Requirements
- Loughran S., Hadoop and Kerberos: The Madness beyond the Gate