Cloudera CDH 5 Cluster Setup (yum-based)
1 Cluster Environment
Master nodes:
master001 ~~ master006
Slave nodes:
slave001 ~~ slave064
2 Install the CDH 5 yum Repository
rpm -Uvh http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm
or
wget http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo
mv cloudera-cdh5.repo /etc/yum.repos.d/
3 ZooKeeper
3.1 Node Assignment
ZooKeeper server:
master002, master003, master004, master005, master006
ZooKeeper client:
master001, master002, master003, master004, master005, master006
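The ensemble above uses five servers. ZooKeeper stays available as long as a strict majority of servers is up, so five servers tolerate two simultaneous failures. A quick sketch of the arithmetic (the function names are illustrative, not part of any ZooKeeper tooling):

```shell
#!/bin/sh
# Majority quorum size for an ensemble of n servers.
zk_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of server failures the ensemble can tolerate.
zk_tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

zk_quorum 5               # a 5-node ensemble needs 3 live servers
zk_tolerated_failures 5   # and survives 2 failures
```

This is also why ensembles are sized with an odd number of servers: going from 5 to 6 raises the quorum to 4 without tolerating any additional failures.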
3.2 Installation
On ZooKeeper client nodes:
yum install -y zookeeper
On ZooKeeper server nodes:
yum install -y zookeeper-server
3.3 Configuration
1. Edit the ZooKeeper configuration file on every ZooKeeper node:
/etc/zookeeper/conf/zoo.cfg
maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# The directory where the snapshot is stored.
dataDir=/data/disk01/zookeeper/zk_data
dataLogDir=/data/disk01/zookeeper/zk_log
# The port at which the clients will connect
clientPort=2181
server.2=master002:2888:3888
server.3=master003:2888:3888
server.4=master004:2888:3888
server.5=master005:2888:3888
server.6=master006:2888:3888
2. Initialize each server node with its myid:
master002:
service zookeeper-server init --myid=2
master003:
service zookeeper-server init --myid=3
master004:
service zookeeper-server init --myid=4
master005:
service zookeeper-server init --myid=5
master006:
service zookeeper-server init --myid=6
3. Start ZooKeeper:
service zookeeper-server start
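Since each myid matches the numeric suffix of its hostname (master002 → 2), the per-host init commands above can be derived mechanically. A sketch, assuming the master00N naming convention used in this cluster:

```shell
#!/bin/sh
# Derive the ZooKeeper myid from a master00N hostname.
myid_for_host() {
  # Strip the "master" prefix and leading zeros, so "master002" becomes "2".
  echo "$1" | sed 's/^master0*//'
}

# On each server node one would then run (shown here as an echo):
host=master002   # in practice: host=$(hostname -s)
echo "service zookeeper-server init --myid=$(myid_for_host "$host")"
```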
3.4 Installation Paths
Program path:
/usr/lib/zookeeper/
Configuration path:
/etc/zookeeper/conf
Log path:
/var/log/zookeeper
3.5 Start | Stop | Status
ZooKeeper:
service zookeeper-server start|stop|status
3.6 Common Commands
Check ZooKeeper node status:
zookeeper-server status
Clean up logs manually:
/usr/lib/zookeeper/bin/zkCleanup.sh dataLogDir [snapDir] -n count
Automatic log purging:
autopurge.purgeInterval — the purge frequency in hours; set it to an integer of 1 or greater. The default is 0, which disables automatic purging.
autopurge.snapRetainCount — used together with the option above; the number of snapshots (and corresponding transaction logs) to retain. The default is 3.
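Put together, enabling automatic purging means adding two lines to /etc/zookeeper/conf/zoo.cfg on each server node (the values here are illustrative: purge once a day, keep the default 3 snapshots):

```
# zoo.cfg — automatic snapshot/transaction-log purging
autopurge.purgeInterval=24
autopurge.snapRetainCount=3
```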
3.7 Testing
https://github.com/phunt/zk-smoketest
3.8 References
ZooKeeper configuration parameters:
http://my.oschina.net/u/128568/blog/194820
ZooKeeper administration and operations:
http://nileader.blog.51cto.com/1381108/1032157
4 HDFS
4.1 Node Assignment (NameNode HA)
namenode、zkfc:
master002, master003
datanode:
slave001-slave064
journalnode:
master002, master003, master004
4.2 Installation
namenode:
yum install hadoop-hdfs-namenode
yum install hadoop-hdfs-zkfc
(or: yum install -y hadoop-hdfs-namenode hadoop-hdfs-zkfc hadoop-client)
datanode:
yum install hadoop-hdfs-datanode
(or: yum install -y hadoop-hdfs-datanode hadoop-client)
journalnode:
yum install hadoop-hdfs-journalnode
(or: yum install -y hadoop-hdfs-journalnode)
All nodes:
yum install hadoop-client
4.3 Configuration
1. Configuration files
/etc/hadoop/conf/core-site.xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://bdcluster</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
</configuration>
/etc/hadoop/conf/hdfs-site.xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.nameservices</name>
<value>bdcluster</value>
</property>
<property>
<name>dfs.ha.namenodes.bdcluster</name>
<value>nn002,nn003</value>
</property>
<property>
<name>dfs.namenode.rpc-address.bdcluster.nn002</name>
<value>master002:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.bdcluster.nn003</name>
<value>master003:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.bdcluster.nn002</name>
<value>master002:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.bdcluster.nn003</name>
<value>master003:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master002:8485;master003:8485;master004:8485/bdcluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/disk01/hadoop/hdfs/journalnode</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.bdcluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master002:2181,master003:2181,master004:2181,master005:2181,master006:2181</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>hadoop</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/disk01/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/data/disk01/hadoop/hdfs/datanode,/data/disk02/hadoop/hdfs/datanode,/data/disk03/hadoop/hdfs/datanode,/data/disk04/hadoop/hdfs/datanode,/data/disk05/hadoop/hdfs/datanode,/data/disk06/hadoop/hdfs/datanode,/data/disk07/hadoop/hdfs/datanode</value>
</property>
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>3</value>
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
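The hdfs-site.xml above lists 7 data directories and sets dfs.datanode.failed.volumes.tolerated to 3, i.e. a datanode keeps serving until a 4th disk fails. The tolerated value must stay strictly below the number of configured volumes, or the datanode refuses to start. A small sanity-check sketch (the function name is made up for illustration):

```shell
#!/bin/sh
# Check that failed.volumes.tolerated is valid for a given volume count.
# Prints "ok" when tolerated < volumes, "invalid" otherwise.
check_volumes_tolerated() {
  tolerated=$1
  volumes=$2
  if [ "$tolerated" -lt "$volumes" ]; then
    echo ok
  else
    echo invalid
  fi
}

check_volumes_tolerated 3 7   # the configuration above
check_volumes_tolerated 7 7   # would prevent datanode startup
```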
/etc/hadoop/conf/slaves
slave001
slave002
…
slave064
2. Configure passwordless SSH login for the hdfs user (required by the sshfence fencing method configured above)
3. Create the data directories
namenode
mkdir -p /data/disk01/hadoop/hdfs/namenode
chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/
chmod 700 /data/disk01/hadoop/hdfs/namenode
datanode
mkdir -p /data/disk01/hadoop/hdfs/datanode
chmod 700 /data/disk01/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/
mkdir -p /data/disk02/hadoop/hdfs/datanode
chmod 700 /data/disk02/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk02/hadoop/hdfs/
mkdir -p /data/disk03/hadoop/hdfs/datanode
chmod 700 /data/disk03/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk03/hadoop/hdfs/
mkdir -p /data/disk04/hadoop/hdfs/datanode
chmod 700 /data/disk04/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk04/hadoop/hdfs/
mkdir -p /data/disk05/hadoop/hdfs/datanode
chmod 700 /data/disk05/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk05/hadoop/hdfs/
mkdir -p /data/disk06/hadoop/hdfs/datanode
chmod 700 /data/disk06/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk06/hadoop/hdfs/
mkdir -p /data/disk07/hadoop/hdfs/datanode
chmod 700 /data/disk07/hadoop/hdfs/datanode
chown -R hdfs:hdfs /data/disk07/hadoop/hdfs/
journalnode
mkdir -p /data/disk01/hadoop/hdfs/journalnode
chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/journalnode
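The per-disk mkdir/chmod/chown sequence on the datanodes is identical for every disk, so it can be scripted. A sketch assuming the /data/disk01–disk07 layout used in this document (DRY_RUN=1 keeps it harmless by only printing the commands):

```shell
#!/bin/sh
# Create the datanode directories for disks 01..07.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$@"; else "$@"; fi
}

for i in 1 2 3 4 5 6 7; do
  disk=$(printf '/data/disk%02d' "$i")
  run mkdir -p "$disk/hadoop/hdfs/datanode"
  run chmod 700 "$disk/hadoop/hdfs/datanode"
  run chown -R hdfs:hdfs "$disk/hadoop/hdfs/"
done
```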
4. Start the journalnodes
service hadoop-hdfs-journalnode start
5. Format the namenode (on master002)
sudo -u hdfs hadoop namenode -format
6. Initialize the HA state in ZooKeeper (on namenode master002)
hdfs zkfc -formatZK
7. Initialize the shared edits directory (on master002)
hdfs namenode -initializeSharedEdits
8. Start the namenodes
On the formatted namenode (master002):
service hadoop-hdfs-namenode start
On the standby namenode (master003):
sudo -u hdfs hdfs namenode -bootstrapStandby
service hadoop-hdfs-namenode start
9. Start the datanodes
service hadoop-hdfs-datanode start
10. Start zkfc (on the namenodes)
service hadoop-hdfs-zkfc start
11. Initialize the HDFS directory layout
/usr/lib/hadoop/libexec/init-hdfs.sh
4.4 Installation Paths
Program path:
/usr/lib/hadoop-hdfs
Configuration path:
/etc/hadoop/conf
Log path:
/var/log/hadoop-hdfs
4.5 Start | Stop | Status
NameNode:
service hadoop-hdfs-namenode start|stop|status
DataNode:
service hadoop-hdfs-datanode start|stop|status
JournalNode:
service hadoop-hdfs-journalnode start|stop|status
zkfc:
service hadoop-hdfs-zkfc start|stop|status
4.6 Common Commands
Check cluster status:
sudo -u hdfs hdfs dfsadmin -report
Check a file and its replicas:
sudo -u hdfs hdfs fsck [path] -files -blocks -locations -racks
5 YARN
5.1 Node Assignment
resourcemanager:
master004
nodemanager, mapreduce:
slave001-slave064
mapreduce-historyserver:
master006
5.2 Installation
resourcemanager:
yum -y install hadoop-yarn-resourcemanager
nodemanager:
yum -y install hadoop-yarn-nodemanager hadoop-mapreduce
mapreduce-historyserver:
yum -y install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver
All nodes:
yum -y install hadoop-client
5.3 Configuration
1. Configuration files
/etc/hadoop/conf/mapred-site.xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>1024</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-XX:-UseGCOverheadLimit -Xms1024m -Xmx2048m</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx2048m</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master006:10020</value>
<description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master006:19888</value>
<description>MapReduce JobHistory Server Web UI host:port</description>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/user/history/done_intermediate</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/user/history/done</value>
</property>
</configuration>
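One thing worth double-checking in the mapred-site.xml above: mapreduce.map.memory.mb is 2048 while mapred.child.java.opts allows -Xmx2048m, so a map task's JVM heap can grow to the full container size, leaving no headroom for non-heap JVM overhead before YARN kills the container for exceeding its limit. A common rule of thumb (an assumption, not a CDH requirement) is to cap the heap at roughly 80% of the container size:

```shell
#!/bin/sh
# Suggested -Xmx (in MB) for a given container size, using an 80% rule of thumb.
suggested_heap_mb() {
  echo $(( $1 * 80 / 100 ))
}

suggested_heap_mb 2048   # map containers
suggested_heap_mb 4096   # reduce containers
```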
/etc/hadoop/conf/yarn-site.xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master004:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master004:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master004:8030</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master004:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master004:8088</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<description>List of directories to store localized files in.</description>
<name>yarn.nodemanager.local-dirs</name>
<value>/data/disk01/hadoop/yarn/local,/data/disk02/hadoop/yarn/local,/data/disk03/hadoop/yarn/local,/data/disk04/hadoop/yarn/local,/data/disk05/hadoop/yarn/local</value>
</property>
<property>
<description>Where to store container logs.</description>
<name>yarn.nodemanager.log-dirs</name>
<value>/data/disk01/hadoop/yarn/logs,/data/disk02/hadoop/yarn/logs,/data/disk03/hadoop/yarn/logs,/data/disk04/hadoop/yarn/logs,/data/disk05/hadoop/yarn/logs</value>
</property>
<!--property>
<description>Where to aggregate logs to.</description>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/var/log/hadoop-yarn/apps</value>
</property-->
<property>
<description>Classpath for typical applications.</description>
<name>yarn.application.classpath</name>
<value>
$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/user</value>
</property>
<property>
<description>The minimum allocation for every container request at the RM,
in MBs. Memory requests lower than this won't take effect,
and the specified value will get allocated at minimum.</description>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<description>The maximum allocation for every container request at the RM,
in MBs. Memory requests higher than this won't take effect,
and will get capped to this value.</description>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>16384</value>
</property>
<property>
<description>The minimum allocation for every container request at the RM,
in terms of virtual CPU cores. Requests lower than this won't take effect,
and the specified value will get allocated the minimum.</description>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>
<property>
<description>The maximum allocation for every container request at the RM,
in terms of virtual CPU cores. Requests higher than this won't take effect,
and will get capped to this value.</description>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>32</value>
</property>
<property>
<description>Number of CPU cores that can be allocated
for containers.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>48</value>
</property>
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers.</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>120000</value>
</property>
<property>
<description>Ratio between virtual memory to physical memory when
setting memory limits for containers. Container allocations are
expressed in terms of physical memory, and virtual memory usage
is allowed to exceed this allocation by this ratio.
</description>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>6</value>
</property>
</configuration>
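With the values above, each nodemanager advertises 120000 MB and 48 vcores. The number of containers a node can run concurrently is bounded by both resources; at the 1024 MB minimum allocation the memory bound is 117 containers, so here the 48 vcores are the binding limit. A sketch of the arithmetic:

```shell
#!/bin/sh
# Upper bound on concurrent containers per node, assuming one vcore per
# container: min(memory-mb / minimum-allocation-mb, vcores).
max_containers() {
  mem=$1; min_alloc=$2; vcores=$3
  by_mem=$(( mem / min_alloc ))
  if [ "$by_mem" -lt "$vcores" ]; then
    echo "$by_mem"
  else
    echo "$vcores"
  fi
}

max_containers 120000 1024 48   # this cluster: vcores are the limit
```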
2. Create the local directories on each nodemanager
mkdir -p /data/disk01/hadoop/yarn/local /data/disk02/hadoop/yarn/local /data/disk03/hadoop/yarn/local /data/disk04/hadoop/yarn/local /data/disk05/hadoop/yarn/local
mkdir -p /data/disk01/hadoop/yarn/logs /data/disk02/hadoop/yarn/logs /data/disk03/hadoop/yarn/logs /data/disk04/hadoop/yarn/logs /data/disk05/hadoop/yarn/logs
chown -R yarn:yarn /data/disk01/hadoop/yarn /data/disk02/hadoop/yarn /data/disk03/hadoop/yarn /data/disk04/hadoop/yarn /data/disk05/hadoop/yarn
3. Create the history directory in HDFS
sudo -u hdfs hadoop fs -mkdir /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown yarn /user/history
4. Start the services
resourcemanager:
sudo service hadoop-yarn-resourcemanager start
nodemanager:
sudo service hadoop-yarn-nodemanager start
mapreduce-historyserver:
sudo service hadoop-mapreduce-historyserver start
5.4 Installation Paths
Program path:
/usr/lib/hadoop-yarn
Configuration path:
/etc/hadoop/conf
Log path:
/var/log/hadoop-yarn
5.5 Start | Stop | Status
resourcemanager:
service hadoop-yarn-resourcemanager start|stop|status
nodemanager:
service hadoop-yarn-nodemanager start|stop|status
mapreduce-historyserver:
service hadoop-mapreduce-historyserver start|stop|status
5.6 Common Commands
Check node status:
yarn node -list -all
resourcemanager administration:
yarn rmadmin ...
6 HBase
6.1 Node Assignment
hbase-master:
master004, master005, master006
hbase-regionserver:
slave001 ~~ slave064
hbase-thrift:
master004, master005, master006
hbase-rest:
master004, master005, master006
6.2 Installation
hbase-master:
yum install -y hbase hbase-master
hbase-regionserver:
yum install -y hbase hbase-regionserver
hbase-thrift:
yum install -y hbase-thrift
hbase-rest:
yum install -y hbase-rest
6.3 Configuration
1. Configuration files
/etc/security/limits.conf
hdfs - nofile 32768
hbase - nofile 32768
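The limits.conf entries only take effect for new login sessions, so it is worth verifying them from a fresh shell running as the affected user. A small sketch (the 32768 threshold matches the configuration above; the function name is illustrative):

```shell
#!/bin/sh
# Report the current open-file limit and whether it meets a threshold.
check_nofile() {
  threshold=$1
  current=$(ulimit -n)
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$threshold" ]; then
    echo "nofile=$current ok"
  else
    echo "nofile=$current too low (want >= $threshold)"
  fi
}

check_nofile 32768
```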
/etc/hbase/conf/hbase-site.xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rest.port</name>
<value>60050</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master002,master003,master004,master005,master006</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/tmp/hadoop/hbase</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://bdcluster/hbase/</value>
</property>
</configuration>
/etc/hbase/conf/hbase-env.sh
# Set environment variables here.
# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)
# The java implementation to use. Java 1.6 required.
# export JAVA_HOME=/usr/java/default/
# Extra Java CLASSPATH elements. Optional.
# export HBASE_CLASSPATH=
# The maximum amount of heap to use, in MB. Default is 1000.
# export HBASE_HEAPSIZE=1000
# Extra Java runtime options.
# Below are what we set by default. May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.
# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M $HBASE_GC_OPTS"
# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
# Uncomment one of the below three options to enable java garbage collection logging for the client processes.
# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"
# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.
# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="
# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.
export HBASE_USE_GC_LOGFILE=true
# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
#
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"
# File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
# HBASE_REGIONSERVER_MLOCK=true
# HBASE_REGIONSERVER_UID="hbase"
# File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
# Extra ssh options. Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
# Where log files are stored. $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs
# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER
# The scheduling priority for daemon processes. See 'man nice'.
# export HBASE_NICENESS=10
# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids
# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1
# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
# RFA appender. Please refer to the log4j.properties file to see more details on this appender.
# In case one needs to do log rolling on a date change, one should set the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
2. Start the services
hbase-master:
service hbase-master start
hbase-regionserver:
service hbase-regionserver start
hbase-thrift:
service hbase-thrift start
hbase-rest:
service hbase-rest start
6.4 Installation Paths
Program path:
/usr/lib/hbase
Configuration path:
/etc/hbase/conf
Log path:
/var/log/hbase
6.5 Start | Stop | Status
hbase-master:
service hbase-master start|stop|status
hbase-regionserver:
service hbase-regionserver start|stop|status
hbase-thrift:
service hbase-thrift start|stop|status
hbase-rest:
service hbase-rest start|stop|status
6.6 Common Commands
hbase shell
7 Spark
7.1 Node Assignment
master002 ~~ master006
7.2 Installation
yum install spark-core spark-master spark-worker spark-python
7.3 Configuration
1. /etc/spark/conf/spark-env.sh
export SPARK_HOME=/usr/lib/spark
2. Deploy the Spark assembly jar to HDFS
source /etc/spark/conf/spark-env.sh
sudo -u hdfs hdfs dfs -mkdir -p /user/spark/share/lib
sudo -u hdfs hdfs dfs -put /usr/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar /user/spark/share/lib/spark-assembly.jar
7.4 Installation Paths
Program path:
/usr/lib/spark
Configuration path:
/etc/spark/conf
Log path:
/var/log/spark
Spark assembly jar in HDFS:
/user/spark/share/lib/spark-assembly.jar
7.5 Example Program
source /etc/spark/conf/spark-env.sh
SPARK_JAR=hdfs://bdcluster/user/spark/share/lib/spark-assembly.jar APP_JAR=$SPARK_HOME/examples/lib/spark-examples_2.10-0.9.0-cdh5.0.0.jar $SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client --jar $APP_JAR --class org.apache.spark.examples.SparkPi --args yarn-standalone --args 10