1      Cluster Environment

Master nodes

master001 ~~ master006

Slave nodes

slave001 ~~ slave064

2      Install the CDH5 YUM Repository

rpm -Uvh http://archive.cloudera.com/cdh5/one-click-install/redhat/6/x86_64/cloudera-cdh-5-0.x86_64.rpm

wget http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo

mv cloudera-cdh5.repo /etc/yum.repos.d/
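Optionally, import Cloudera's GPG key so yum can verify package signatures, then confirm the repo is visible. The key URL below follows the same archive layout as the repo file above; treat it as an assumption and adjust it if your mirror differs:

rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
yum clean all
yum repolist | grep -i cloudera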

3      ZooKeeper

3.1    Node Assignment

ZooKeeper Server:

master002, master003, master004, master005, master006

ZooKeeper Client:

master001, master002, master003, master004, master005, master006

3.2    Installation

ZooKeeper Client nodes:

yum install -y zookeeper

ZooKeeper Server nodes:

yum install -y zookeeper-server

3.3    Configuration

1. Edit the ZooKeeper configuration file on all ZooKeeper nodes

/etc/zookeeper/conf/zoo.cfg

maxClientCnxns=50

# The number of milliseconds of each tick

tickTime=2000

# The number of ticks that the initial

# synchronization phase can take

initLimit=10

# The number of ticks that can pass between

# sending a request and getting an acknowledgement

syncLimit=5

# the directory where the snapshot is stored.

dataDir=/data/disk01/zookeeper/zk_data

dataLogDir=/data/disk01/zookeeper/zk_log

# the port at which the clients will connect

clientPort=2181

server.2=master002:2888:3888

server.3=master003:2888:3888

server.4=master004:2888:3888

server.5=master005:2888:3888

server.6=master006:2888:3888
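Before initializing the servers below, the dataDir and dataLogDir referenced in zoo.cfg should exist and be owned by the zookeeper user; a minimal sketch (skip it if the init script already creates them in your environment):

mkdir -p /data/disk01/zookeeper/zk_data /data/disk01/zookeeper/zk_log
chown -R zookeeper:zookeeper /data/disk01/zookeeper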

2. Initialize the nodes

master002:

service zookeeper-server init --myid=2

master003:

service zookeeper-server init --myid=3

master004:

service zookeeper-server init --myid=4

master005:

service zookeeper-server init --myid=5

master006:

service zookeeper-server init --myid=6

3. Start ZooKeeper

service zookeeper-server start
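A quick sanity check of the ensemble, assuming nc is installed and using the zookeeper-client wrapper shipped with the zookeeper package:

echo ruok | nc master002 2181                      # expect "imok"
echo stat | nc master002 2181                      # shows leader/follower mode
zookeeper-client -server master002:2181 ls /       # lists the root znodes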

3.4    Installation Paths

Program path

/usr/lib/zookeeper/

Configuration file path

/etc/zookeeper/conf

Log path

/var/log/zookeeper

3.5    Start | Stop | Status

ZooKeeper

service zookeeper-server start|stop|status

3.6    Common Commands

Check ZooKeeper node status

zookeeper-server status

Manually clean up logs

/usr/lib/zookeeper/bin/zkCleanup.sh dataLogDir [snapDir] -n count

Automatic log purging

autopurge.purgeInterval specifies the purge frequency in hours. It must be set to 1 or a larger integer; the default is 0, which disables automatic purging.

autopurge.snapRetainCount works together with the parameter above and specifies the number of snapshots/logs to retain. The default is 3.
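For example, purging once a day while keeping the five most recent snapshots would look like this in zoo.cfg (the values are illustrative, not required):

autopurge.purgeInterval=24
autopurge.snapRetainCount=5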

3.7    Testing

https://github.com/phunt/zk-smoketest

3.8    References

ZooKeeper parameter configuration

http://my.oschina.net/u/128568/blog/194820

Common ZooKeeper administration and operations

http://nileader.blog.51cto.com/1381108/1032157

4      HDFS

4.1    Node Assignment (NameNode HA)

namenode、zkfc:

master002, master003

datanode:

slave001-slave064

journalnode:

master002, master003, master004

4.2    Installation

namenode:

yum install hadoop-hdfs-namenode

yum install hadoop-hdfs-zkfc

(yum install -y hadoop-hdfs-namenode hadoop-hdfs-zkfc hadoop-client)

datanode:

yum install hadoop-hdfs-datanode

(yum install -y hadoop-hdfs-datanode hadoop-client)

journalnode:

yum install hadoop-hdfs-journalnode

(yum install -y hadoop-hdfs-journalnode)

All nodes:

yum install hadoop-client

4.3    Configuration

1. Configuration files

/etc/hadoop/conf/core-site.xml

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://bdcluster</value>

</property>

<property>

<name>fs.trash.interval</name>

<value>1440</value>

</property>

<property>

<name>hadoop.proxyuser.httpfs.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.httpfs.groups</name>

<value>*</value>

</property>

</configuration>

/etc/hadoop/conf/hdfs-site.xml

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>dfs.nameservices</name>

<value>bdcluster</value>

</property>

<property>

<name>dfs.ha.namenodes.bdcluster</name>

<value>nn002,nn003</value>

</property>

<property>

<name>dfs.namenode.rpc-address.bdcluster.nn002</name>

<value>master002:8020</value>

</property>

<property>

<name>dfs.namenode.rpc-address.bdcluster.nn003</name>

<value>master003:8020</value>

</property>

<property>

<name>dfs.namenode.http-address.bdcluster.nn002</name>

<value>master002:50070</value>

</property>

<property>

<name>dfs.namenode.http-address.bdcluster.nn003</name>

<value>master003:50070</value>

</property>

<property>

<name>dfs.namenode.shared.edits.dir</name>

<value>qjournal://master002:8485;master003:8485;master004:8485/bdcluster</value>

</property>

<property>

<name>dfs.journalnode.edits.dir</name>

<value>/data/disk01/hadoop/hdfs/journalnode</value>

</property>

<property>

<name>dfs.client.failover.proxy.provider.bdcluster</name>

<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

<property>

<name>dfs.ha.fencing.methods</name>

<value>sshfence</value>

</property>

<property>

<name>dfs.ha.fencing.ssh.private-key-files</name>

<value>/var/lib/hadoop-hdfs/.ssh/id_dsa</value>

</property>

<property>

<name>dfs.ha.automatic-failover.enabled</name>

<value>true</value>

</property>

<property>

<name>ha.zookeeper.quorum</name>

<value>master002:2181,master003:2181,master004:2181,master005:2181,master006:2181</value>

</property>

<property>

<name>dfs.permissions.superusergroup</name>

<value>hadoop</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>/data/disk01/hadoop/hdfs/namenode</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>/data/disk01/hadoop/hdfs/datanode,/data/disk02/hadoop/hdfs/datanode,/data/disk03/hadoop/hdfs/datanode,/data/disk04/hadoop/hdfs/datanode,/data/disk05/hadoop/hdfs/datanode,/data/disk06/hadoop/hdfs/datanode,/data/disk07/hadoop/hdfs/datanode</value>

</property>

<property>

<name>dfs.datanode.failed.volumes.tolerated</name>

<value>3</value>

</property>

<property>

<name>dfs.datanode.max.xcievers</name>

<value>4096</value>

</property>

<property>

<name>dfs.webhdfs.enabled</name>

<value>true</value>

</property>

</configuration>

/etc/hadoop/conf/slaves

slave001

slave002

slave064

2. Configure passwordless SSH login for the hdfs user
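The sshfence method configured above logs in as the hdfs user with the key named in dfs.ha.fencing.ssh.private-key-files, so each NameNode needs that key pair and the peer's public key in authorized_keys. One possible sketch, run on master002 and repeated in the opposite direction on master003 (it assumes /var/lib/hadoop-hdfs is the hdfs user's home directory; if the hdfs account cannot log in with a password, append the public key to the peer's authorized_keys by hand instead of using ssh-copy-id):

sudo -u hdfs mkdir -p /var/lib/hadoop-hdfs/.ssh
sudo -u hdfs ssh-keygen -t dsa -f /var/lib/hadoop-hdfs/.ssh/id_dsa -N ''
sudo -u hdfs ssh-copy-id -i /var/lib/hadoop-hdfs/.ssh/id_dsa.pub hdfs@master003
sudo -u hdfs ssh -i /var/lib/hadoop-hdfs/.ssh/id_dsa hdfs@master003 hostname   # should print master003 without a password prompt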

3. Create the data directories

namenode

mkdir -p /data/disk01/hadoop/hdfs/namenode

chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/

chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/namenode

chmod 700 /data/disk01/hadoop/hdfs/namenode

datanode

mkdir -p /data/disk01/hadoop/hdfs/datanode

chmod 700 /data/disk01/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/

mkdir -p /data/disk02/hadoop/hdfs/datanode

chmod 700 /data/disk02/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk02/hadoop/hdfs/

mkdir -p /data/disk03/hadoop/hdfs/datanode

chmod 700 /data/disk03/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk03/hadoop/hdfs/

mkdir -p /data/disk04/hadoop/hdfs/datanode

chmod 700 /data/disk04/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk04/hadoop/hdfs/

mkdir -p /data/disk05/hadoop/hdfs/datanode

chmod 700 /data/disk05/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk05/hadoop/hdfs/

mkdir -p /data/disk06/hadoop/hdfs/datanode

chmod 700 /data/disk06/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk06/hadoop/hdfs/

mkdir -p /data/disk07/hadoop/hdfs/datanode

chmod 700 /data/disk07/hadoop/hdfs/datanode

chown -R hdfs:hdfs /data/disk07/hadoop/hdfs/

journalnode

mkdir -p /data/disk01/hadoop/hdfs/journalnode

chown -R hdfs:hdfs /data/disk01/hadoop/hdfs/journalnode

4. Start the JournalNodes

service hadoop-hdfs-journalnode start

5. Format the NameNode (master002)

sudo -u hdfs hadoop namenode -format

6. Initialize the HA state in ZooKeeper (on NameNode master002)

hdfs zkfc -formatZK

7. Initialize the shared edits directory (master002)

hdfs namenode -initializeSharedEdits

8. Start the NameNodes

Formatted NameNode (master002):

service hadoop-hdfs-namenode start

Standby NameNode (master003):

sudo -u hdfs hdfs namenode -bootstrapStandby

service hadoop-hdfs-namenode start

9. Start the DataNodes

service hadoop-hdfs-datanode start

10. Start ZKFC (on the NameNode hosts)

service hadoop-hdfs-zkfc start

11. Initialize the HDFS directories

/usr/lib/hadoop/libexec/init-hdfs.sh
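At this point the HA pair can be verified; the NameNode IDs nn002 and nn003 are the ones defined in hdfs-site.xml above:

sudo -u hdfs hdfs haadmin -getServiceState nn002   # active or standby
sudo -u hdfs hdfs haadmin -getServiceState nn003
sudo -u hdfs hdfs dfs -ls /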

4.4    Installation Paths

Program path

/usr/lib/hadoop-hdfs

Configuration file path

/etc/hadoop/conf

Log path

/var/log/hadoop-hdfs

4.5    Start | Stop | Status

NameNode

service hadoop-hdfs-namenode start|stop|status

DataNode

service hadoop-hdfs-datanode start|stop|status

JournalNode

service hadoop-hdfs-journalnode start|stop|status

zkfc

service hadoop-hdfs-zkfc start|stop|status

4.6    Common Commands

Check cluster status

sudo -u hdfs hdfs dfsadmin -report

Check a file and its replicas

sudo -u hdfs hdfs fsck [filename] -files -blocks -locations -racks
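For example, checking the whole filesystem from the root path:

sudo -u hdfs hdfs fsck / -files -blocks -locations -racks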

5      YARN

5.1    Node Assignment

resourcemanager:

master004

nodemanager、mapreduce:

slave001-slave064

mapreduce-historyserver:

master006

5.2    Installation

resourcemanager:

yum -y install hadoop-yarn-resourcemanager

nodemanager:

yum -y install hadoop-yarn-nodemanagerhadoop-mapreduce

mapreduce-historyserver:

yum -y install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver

All nodes:

yum -y install hadoop-client

5.3    Configuration

1. Configuration files

/etc/hadoop/conf/mapred-site.xml

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>mapreduce.task.io.sort.mb</name>

<value>1024</value>

</property>

<property>

<name>mapred.child.java.opts</name>

<value>-XX:-UseGCOverheadLimit -Xms1024m -Xmx2048m</value>

</property>

<property>

<name>yarn.app.mapreduce.am.command-opts</name>

<value>-Xmx2048m</value>

</property>

<property>

<name>mapreduce.jobhistory.address</name>

<value>master006:10020</value>

<description>MapReduce JobHistory Server IPC host:port</description>

</property>

<property>

<name>mapreduce.jobhistory.webapp.address</name>

<value>master006:19888</value>

<description>MapReduce JobHistory Server Web UI host:port</description>

</property>

<property>

<name>mapreduce.map.memory.mb</name>

<value>2048</value>

</property>

<property>

<name>mapreduce.reduce.memory.mb</name>

<value>4096</value>

</property>

<property>

<name>mapreduce.jobhistory.intermediate-done-dir</name>

<value>/user/history/done_intermediate</value>

</property>

<property>

<name>mapreduce.jobhistory.done-dir</name>

<value>/user/history/done</value>

</property>

</configuration>

/etc/hadoop/conf/yarn-site.xml

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>master004:8031</value>

</property>

<property>

<name>yarn.resourcemanager.address</name>

<value>master004:8032</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>master004:8030</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>master004:8033</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>master004:8088</value>

</property>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.log-aggregation-enable</name>

<value>true</value>

</property>

<property>

<description>List of directories to store localized files in.</description>

<name>yarn.nodemanager.local-dirs</name>

<value>/data/disk01/hadoop/yarn/local,/data/disk02/hadoop/yarn/local,/data/disk03/hadoop/yarn/local,/data/disk04/hadoop/yarn/local,/data/disk05/hadoop/yarn/local</value>

</property>

<property>

<description>Where to store container logs.</description>

<name>yarn.nodemanager.log-dirs</name>

<value>/data/disk01/hadoop/yarn/logs,/data/disk02/hadoop/yarn/logs,/data/disk03/hadoop/yarn/logs,/data/disk04/hadoop/yarn/logs,/data/disk05/hadoop/yarn/logs</value>

</property>

<!--property>

<description>Where to aggregate logs to.</description>

<name>yarn.nodemanager.remote-app-log-dir</name>

<value>/var/log/hadoop-yarn/apps</value>

</property-->

<property>

<description>Classpath for typical applications.</description>

<name>yarn.application.classpath</name>

<value>

$HADOOP_CONF_DIR,

$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,

$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,

$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,

$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*

</value>

</property>

<property>

<name>yarn.app.mapreduce.am.staging-dir</name>

<value>/user</value>

</property>

<property>

<description>The minimum allocation for every container request at the RM,

in MBs. Memory requests lower than this won't take effect,

and the specified value will get allocated at minimum.</description>

<name>yarn.scheduler.minimum-allocation-mb</name>

<value>1024</value>

</property>

<property>

<description>The maximum allocation for every container request at the RM,

in MBs. Memory requests higher than this won't take effect,

and will get capped to this value.</description>

<name>yarn.scheduler.maximum-allocation-mb</name>

<value>16384</value>

</property>

<property>

<description>The minimum allocation for every container request at the RM,

in terms of virtual CPU cores. Requests lower than this won't take effect,

and the specified value will get allocated the minimum.</description>

<name>yarn.scheduler.minimum-allocation-vcores</name>

<value>1</value>

</property>

<property>

<description>The maximum allocation for every container request at the RM,

in terms of virtual CPU cores. Requests higher than this won't take effect,

and will get capped to this value.</description>

<name>yarn.scheduler.maximum-allocation-vcores</name>

<value>32</value>

</property>

<property>

<description>Number of CPU cores that can be allocated

for containers.</description>

<name>yarn.nodemanager.resource.cpu-vcores</name>

<value>48</value>

</property>

<property>

<description>Amount of physical memory, in MB, that can be allocated

for containers.</description>

<name>yarn.nodemanager.resource.memory-mb</name>

<value>120000</value>

</property>

<property>

<description>Ratio between virtual memory to physical memory when

setting memory limits for containers. Container allocations are

expressed in terms of physical memory, and virtual memory usage

is allowed to exceed this allocation by this ratio.

</description>

<name>yarn.nodemanager.vmem-pmem-ratio</name>

<value>6</value>

</property>

</configuration>

2. Create local directories on the NodeManager nodes

mkdir -p /data/disk01/hadoop/yarn/local /data/disk02/hadoop/yarn/local /data/disk03/hadoop/yarn/local /data/disk04/hadoop/yarn/local /data/disk05/hadoop/yarn/local

mkdir -p /data/disk01/hadoop/yarn/logs /data/disk02/hadoop/yarn/logs /data/disk03/hadoop/yarn/logs /data/disk04/hadoop/yarn/logs /data/disk05/hadoop/yarn/logs

chown -R yarn:yarn /data/disk01/hadoop/yarn /data/disk02/hadoop/yarn /data/disk03/hadoop/yarn /data/disk04/hadoop/yarn /data/disk05/hadoop/yarn

chown -R yarn:yarn /data/disk01/hadoop/yarn/local /data/disk02/hadoop/yarn/local /data/disk03/hadoop/yarn/local /data/disk04/hadoop/yarn/local /data/disk05/hadoop/yarn/local

chown -R yarn:yarn /data/disk01/hadoop/yarn/logs /data/disk02/hadoop/yarn/logs /data/disk03/hadoop/yarn/logs /data/disk04/hadoop/yarn/logs /data/disk05/hadoop/yarn/logs

3. Create the job history directories in HDFS

sudo -u hdfs hadoop fs -mkdir /user/history

sudo -u hdfs hadoop fs -chmod -R 1777 /user/history

sudo -u hdfs hadoop fs -chown yarn /user/history

4. Start the services

resourcemanager:

sudo service hadoop-yarn-resourcemanager start

nodemanager:

sudo service hadoop-yarn-nodemanager start

mapreduce-historyserver:

sudo service hadoop-mapreduce-historyserver start
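A simple end-to-end smoke test is the bundled pi example. The jar path below is where the CDH hadoop-mapreduce package normally installs the examples; adjust it if your layout differs, and run it as a user that has a home directory in HDFS:

sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 4 100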

5.4    Installation Paths

Program path

/usr/lib/hadoop-yarn

Configuration file path

/etc/hadoop/conf

Log path

/var/log/hadoop-yarn

5.5    Start | Stop | Status

resourcemanager:

service hadoop-yarn-resourcemanager start|stop|status

nodemanager:

service hadoop-yarn-nodemanager start|stop|status

mapreduce-historyserver:

service hadoop-mapreduce-historyserver start|stop|status

5.6    Common Commands

Check node status

yarn node -list -all

ResourceManager administration

yarn rmadmin ...
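For instance, after editing the node include/exclude lists or the scheduler queues:

yarn rmadmin -refreshNodes
yarn rmadmin -refreshQueues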

6      HBase

6.1    Node Assignment

hbase-master

master004, master005, master006

hbase-regionserver

slave001 ~~ slave064

hbase-thrift

master004, master005, master006

hbase-rest

master004, master005, master006

6.2    Installation

hbase-master

yum install -y hbase hbase-master

hbase-regionserver

yum install -y hbase hbase-regionserver

hbase-thrift

yum install -y hbase-thrift

hbase-rest

yum install -y hbase-rest

6.3    Configuration

1. Configuration files

/etc/security/limits.conf

hdfs - nofile 32768

hbase - nofile 32768

/etc/hbase/conf/hbase-site.xml

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>

<name>hbase.rest.port</name>

<value>60050</value>

</property>

<property>

<name>hbase.zookeeper.quorum</name>

<value>master002,master003,master004,master005,master006</value>

</property>

<property>

<name>hbase.cluster.distributed</name>

<value>true</value>

</property>

<property>

<name>hbase.tmp.dir</name>

<value>/tmp/hadoop/hbase</value>

</property>

<property>

<name>hbase.rootdir</name>

<value>hdfs://bdcluster/hbase/</value>

</property>

</configuration>

/etc/hbase/conf/hbase-env.sh

# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,

# so try to keep things idempotent unless you want to take an even deeper look

# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.6 required.

# export JAVA_HOME=/usr/java/default/

# Extra Java CLASSPATH elements.  Optional.

# export HBASE_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.

# export HBASE_HEAPSIZE=1000

# Extra Java runtime options.

# Below are what we set by default.  May only work with SUN JVM.

# For more on why as well as other possible settings,

# see http://wiki.apache.org/hadoop/PerformanceTuning

export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.

# This enables basic gc logging to the .out file.

# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"

export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M $HBASE_GC_OPTS"

# This enables basic gc logging to its own file.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage collection logging for the client processes.

# This enables basic gc logging to the .out file.

# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps $HBASE_GC_OPTS"

# This enables basic gc logging to its own file.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.

# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.

# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="

# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.

export HBASE_USE_GC_LOGFILE=true

# Uncomment and adjust to enable JMX exporting

# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.

# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html

#

# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"

# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"

# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"

# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"

# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"

# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.

# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be memory resident

#HBASE_REGIONSERVER_MLOCK=true

#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.

# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options.  Empty by default.

# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored.  $HBASE_HOME/logs by default.

# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers

# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"

# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"

# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"

# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.

# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes. See 'man nice'.

# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.

# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands.  Unset by default.  This

# can be useful in large clusters, where, e.g., slave rsyncs can

# otherwise arrive faster than the master can service them.

# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.

export HBASE_MANAGES_ZK=false

# The default log rolling policy is RFA, where the log file is rolled as per the size defined for the

# RFA appender. Please refer to the log4j.properties file to see more details on this appender.

# In case one needs to do log rolling on a date change, one should set the environment property

# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".

# For example:

# HBASE_ROOT_LOGGER=INFO,DRFA

# The reason for changing default to RFA is to avoid the boundary case of filling out disk space as

# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
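Because hbase.rootdir points at hdfs://bdcluster/hbase/, that directory usually has to exist in HDFS and be owned by the hbase user before the master is started for the first time; a minimal sketch:

sudo -u hdfs hadoop fs -mkdir /hbase
sudo -u hdfs hadoop fs -chown hbase /hbase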

2. Start the services

hbase-master

service hbase-master start

hbase-regionserver

service hbase-regionserver start

hbase-thrift

service hbase-thrift start

hbase-rest

service hbase-rest start

6.4    Installation Paths

Program path

/usr/lib/hbase

Configuration file path

/etc/hbase/conf

Log path

/var/log/hbase

6.5    Start | Stop | Status

hbase-master:

service hbase-master start|stop|status

hbase-regionserver:

service hbase-regionserver start|stop|status

hbase-thrift:

service hbase-thrift start|stop|status

hbase-rest:

service hbase-rest start|stop|status

6.6    Common Commands

hbase shell
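A quick smoke test from the shell (the table name smoke_test is only an illustrative placeholder):

hbase shell
> status 'simple'
> create 'smoke_test', 'cf'
> put 'smoke_test', 'r1', 'cf:c1', 'v1'
> scan 'smoke_test'
> disable 'smoke_test'
> drop 'smoke_test'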

7      Spark

7.1    Node Assignment

master002 ~~ master006

7.2    Installation

yum install spark-core spark-master spark-worker spark-python

7.3    Configuration

1. /etc/spark/conf/spark-env.sh

export SPARK_HOME=/usr/lib/spark

2. Deploy the Spark assembly JAR to HDFS

source /etc/spark/conf/spark-env.sh

hdfs dfs -mkdir -p /user/spark/share/lib

sudo -u hdfs hdfs dfs -put /usr/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar /user/spark/share/lib/spark-assembly.jar
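If /user/spark does not exist yet, it may need to be created and handed to the spark user first, and the upload can be verified afterwards (a hedged sketch; init-hdfs.sh may already have created the directory):

sudo -u hdfs hdfs dfs -mkdir -p /user/spark/share/lib
sudo -u hdfs hdfs dfs -chown -R spark /user/spark
sudo -u hdfs hdfs dfs -ls /user/spark/share/lib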

7.4    Installation Paths

Program path

/usr/lib/spark

Configuration file path

/etc/spark/conf

Log path

/var/log/spark

Spark assembly path in HDFS

/user/spark/share/lib/spark-assembly.jar

7.5    Example Program

source /etc/spark/conf/spark-env.sh

SPARK_JAR=hdfs://bdcluster/user/spark/share/lib/spark-assembly.jar APP_JAR=$SPARK_HOME/examples/lib/spark-examples_2.10-0.9.0-cdh5.0.0.jar $SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client --jar $APP_JAR --class org.apache.spark.examples.SparkPi --args yarn-standalone --args 10
