Hadoop + ZK + HBase Environment Setup
Hadoop environment setup
References:
http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
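Download the 2.4.1 binary package, unpack it, and set up each configuration file as described in the links above. On startup you will hit the error "Unable to load realm info from SCDynamicStore". To fix it, add the settings below to hadoop-env.sh. The same error appears when configuring HBase and is fixed the same way in hbase-env.sh.
Settings to add to hadoop-env.sh (hbase-env.sh). The options variable is HADOOP_OPTS in hadoop-env.sh and HBASE_OPTS in hbase-env.sh: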
export JAVA_HOME="/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home"
export HBASE_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
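The same line has to go into two env files under two different variable names, so a small helper can keep that consistent. The following is a minimal sketch, not part of the original setup: the function name `append_krb5_opts` is hypothetical, and appending to the existing variable (rather than overwriting it as the listing above does) is an assumption made so that other options survive.

```shell
#!/usr/bin/env bash
# Append the Kerberos realm workaround to an env file, but only once.
# $1 = env file (e.g. etc/hadoop/hadoop-env.sh or conf/hbase-env.sh)
# $2 = options variable (HADOOP_OPTS for Hadoop, HBASE_OPTS for HBase)
append_krb5_opts() {
  env_file=$1
  var_name=$2
  # Produces e.g.: export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=..."
  line="export ${var_name}=\"\$${var_name} -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk\""
  if grep -q "java.security.krb5.realm" "$env_file" 2>/dev/null; then
    echo "already configured: $env_file"
  else
    printf '%s\n' "$line" >> "$env_file"
    echo "updated: $env_file"
  fi
}

# Usage (paths are examples):
#   append_krb5_opts etc/hadoop/hadoop-env.sh HADOOP_OPTS
#   append_krb5_opts conf/hbase-env.sh HBASE_OPTS
```

Running it twice on the same file leaves a single entry, so it is safe to call from a provisioning script.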
Finally, write your own start and stop scripts
hadoop-start.sh
#!/bin/bash
HADOOP_PREFIX="/Users/zhenweiliu/Work/Software/hadoop-2.4.1"
HADOOP_YARN_HOME="/Users/zhenweiliu/Work/Software/hadoop-2.4.1"
HADOOP_CONF_DIR="/Users/zhenweiliu/Work/Software/hadoop-2.4.1/etc/hadoop"
cluster_name="hadoop_cat"
# Format a new distributed filesystem
if [ "$1" == "format" ]; then
$HADOOP_PREFIX/bin/hdfs namenode -format $cluster_name
fi
# Start the HDFS with the following command, run on the designated NameNode:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
# Run a script to start DataNodes on all slaves:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
# Start the YARN with the following command, run on the designated ResourceManager:
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
# Run a script to start NodeManagers on all slaves:
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
# Start a standalone WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start proxyserver
# Start the MapReduce JobHistory Server with the following command, run on the designated server:
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR start historyserver
hadoop-stop.sh
#!/bin/bash
HADOOP_PREFIX="/Users/zhenweiliu/Work/Software/hadoop-2.4.1"
HADOOP_YARN_HOME="/Users/zhenweiliu/Work/Software/hadoop-2.4.1"
HADOOP_CONF_DIR="/Users/zhenweiliu/Work/Software/hadoop-2.4.1/etc/hadoop"
cluster_name="hadoop_cat"
# Stop the NameNode with the following command, run on the designated NameNode:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
# Run a script to stop DataNodes on all slaves:
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
# Stop the ResourceManager with the following command, run on the designated ResourceManager:
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
# Run a script to stop NodeManagers on all slaves:
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
# Stop the WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:
$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop proxyserver
# Stop the MapReduce JobHistory Server with the following command, run on the designated server:
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh --config $HADOOP_CONF_DIR stop historyserver
hadoop-restart.sh
#!/bin/bash
./hadoop-stop.sh
./hadoop-start.sh
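After a start or restart it helps to confirm that all six daemons actually came up. The sketch below is an addition, not part of the original scripts: it reads the output of the JDK's `jps` tool on stdin and reports each expected daemon as up or down. The function name `report_daemons` is hypothetical, and the daemon names are the main-class names `jps` is expected to print for the daemons started above.

```shell
#!/usr/bin/env bash
# Main-class names expected in `jps` output for the daemons started by hadoop-start.sh.
EXPECTED_DAEMONS="NameNode DataNode ResourceManager NodeManager WebAppProxyServer JobHistoryServer"

# Reads `jps` output ("<pid> <name>" per line) on stdin and prints
# one "<daemon> up" or "<daemon> down" line per expected daemon.
report_daemons() {
  jps_output=$(cat)
  for daemon in $EXPECTED_DAEMONS; do
    # grep -x matches the whole name, so SecondaryNameNode does not count as NameNode.
    if printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon up"
    else
      echo "$daemon down"
    fi
  done
}

# Usage:
#   jps | report_daemons
```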
Finally, here are the Hadoop configuration files I had to set up
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- NameNode Configurations -->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///Users/zhenweiliu/Work/Software/hadoop-2.4.1/name</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>67108864</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>100</value>
</property>
<!-- DataNode Configurations -->
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///Users/zhenweiliu/Work/Software/hadoop-2.4.1/data</value>
</property>
<property>
<!-- Hadoop 2 name of the deprecated dfs.datanode.max.xcievers -->
<name>dfs.datanode.max.transfer.threads</name>
<value>4096</value>
</property>
</configuration>
yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- ResourceManager and NodeManager Configurations -->
<property>
<name>yarn.acl.enable</name>
<value>false</value>
</property>
<!-- ResourceManager Configurations -->
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:9001</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:9002</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:9003</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>localhost:9004</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:9005</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>
<!-- NodeManager Configurations -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>${hadoop.tmp.dir}/nm-local-dir</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>${yarn.log.dir}/userlogs</value>
</property>
<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>10800</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/logs</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- History Server Configurations -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>-1</value>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>-1</value>
</property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- Configurations for MapReduce Applications -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>1536</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024M</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2560M</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>50</value>
</property>
<!-- Configurations for MapReduce JobHistory Server -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>file:///Users/zhenweiliu/Work/Software/hadoop-2.4.1/mr-history/tmp</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>file:///Users/zhenweiliu/Work/Software/hadoop-2.4.1/mr-history/done</value>
</property>
</configuration>
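In the mapred-site.xml above, each container size (mapreduce.map.memory.mb, mapreduce.reduce.memory.mb) is larger than the JVM heap given in the matching java.opts. That headroom matters, because a JVM needs memory beyond its heap and YARN kills containers that exceed their allocation. A minimal sketch of that sanity check follows. The helper names `xmx_mb` and `heap_fits` are hypothetical, and the parser only understands -Xmx values with an m/M suffix.

```shell
#!/usr/bin/env bash
# Prints the -Xmx value (in MB) found in a java.opts string, or nothing if absent.
# Assumption: the value uses the m/M suffix, as in the configuration above.
xmx_mb() {
  printf '%s\n' "$1" | sed -n 's/.*-Xmx\([0-9][0-9]*\)[mM].*/\1/p'
}

# Succeeds when the heap in $2 (java.opts) is strictly smaller than the container size $1 (MB).
heap_fits() {
  container_mb=$1
  heap_mb=$(xmx_mb "$2")
  [ -n "$heap_mb" ] && [ "$heap_mb" -lt "$container_mb" ]
}

# Values taken from the mapred-site.xml above.
heap_fits 1536 "-Xmx1024M" && echo "map: heap fits in container"
heap_fits 3072 "-Xmx2560M" && echo "reduce: heap fits in container"
```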
ZK pseudo-distributed configuration
Make three copies of the ZK instance directory, named:
zookeeper-3.4.5-1
zookeeper-3.4.5-2
zookeeper-3.4.5-3
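The zoo.cfg of each ZK instance is configured as follows. Because all three instances run on one host, each needs its own clientPort, and the server.N lines (N matching the instance's myid) must use distinct port pairs.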
zookeeper-3.4.5-1/zoo.cfg
tickTime=
initLimit=
syncLimit=
dataDir=/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5-1/data
dataLogDir=/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5-1/logs
clientPort=
server.=127.0.0.1::
server.=127.0.0.1::
server.=127.0.0.1::
zookeeper-3.4.5-2/zoo.cfg
tickTime=
initLimit=
syncLimit=
dataDir=/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5-2/data
dataLogDir=/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5-2/logs
clientPort=
server.=127.0.0.1::
server.=127.0.0.1::
server.=127.0.0.1::
zookeeper-3.4.5-3/zoo.cfg
tickTime=
initLimit=
syncLimit=
dataDir=/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5-3/data
dataLogDir=/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5-3/logs
clientPort=
server.=127.0.0.1::
server.=127.0.0.1::
server.=127.0.0.1::
Then create a file named myid in each instance's data directory, containing 1, 2, and 3 respectively, for example
zookeeper-3.4.5-1/data/myid
1
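The three zoo.cfg files and myid files differ only in the instance number, so they can be generated in a loop. The sketch below is illustrative only. Every numeric value in it (tickTime, initLimit, syncLimit, the client ports 2181 to 2183, and the 2888+/3888+ port pairs) is an assumption and not the author's original setting, and the function name `generate_zk_configs` is hypothetical. It writes to conf/zoo.cfg inside each instance, which is where zkServer.sh looks by default.

```shell
#!/usr/bin/env bash
# Writes conf/zoo.cfg and data/myid for three local ZK instances under $1.
# All numeric values are illustrative assumptions.
generate_zk_configs() {
  base_dir=$1    # e.g. /Users/zhenweiliu/Work/Software/zookeeper
  for id in 1 2 3; do
    inst="$base_dir/zookeeper-3.4.5-$id"
    mkdir -p "$inst/conf" "$inst/data" "$inst/logs"
    {
      echo "tickTime=2000"
      echo "initLimit=10"
      echo "syncLimit=5"
      echo "dataDir=$inst/data"
      echo "dataLogDir=$inst/logs"
      # One client port per instance: 2181, 2182, 2183
      echo "clientPort=$((2180 + id))"
      # server.N=host:quorumPort:electionPort, identical list in every instance
      for peer in 1 2 3; do
        echo "server.$peer=127.0.0.1:$((2887 + peer)):$((3887 + peer))"
      done
    } > "$inst/conf/zoo.cfg"
    # myid must match this instance's N in the server.N lines
    echo "$id" > "$inst/data/myid"
  done
}

# Usage:
#   generate_zk_configs /Users/zhenweiliu/Work/Software/zookeeper
```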
Finally, write batch start and stop scripts
startZkCluster.sh
#!/bin/bash
BASE_DIR="/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5"
BIN_EXEC="bin/zkServer.sh start"
for no in $(seq 1 3)
do
$BASE_DIR"-"$no/$BIN_EXEC
done
stopZkCluster.sh
#!/bin/bash
BASE_DIR="/Users/zhenweiliu/Work/Software/zookeeper/zookeeper-3.4.5"
BIN_EXEC="bin/zkServer.sh stop"
for no in $(seq 1 3)
do
$BASE_DIR"-"$no/$BIN_EXEC
done
restartZkCluster.sh
#!/bin/bash
./stopZkCluster.sh
./startZkCluster.sh
HBase
References:
http://abloz.com/hbase/book.html
HBase actually bundles its own ZK. Unless you explicitly configure an external ZK, it uses the bundled one, which starts together with HBase.
Explicitly enable the bundled ZK in hbase-env.sh
export HBASE_MANAGES_ZK=true
hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<configuration>
<!--
<property>
<name>hbase.rootdir</name>
<value>file:///Users/zhenweiliu/Work/Software/hbase-0.98.3-hadoop2/hbase</value>
</property>
-->
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
<description>The directory shared by RegionServers.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>The replication count for HLog and HFile storage. Should not be greater than HDFS datanode count.</description>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/Users/zhenweiliu/Work/Software/hbase-0.98.3-hadoop2/zookeeper</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
</configuration>
Finally, start HBase
./start-hbase.sh
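System parameters
HBase also needs a large number of processes and open files, so the ulimit values must be raised. On my Mac I added the file /etc/launchd.conf with the following content (each line takes the limit value after the keyword)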
limit maxfiles
limit maxproc
and added the following to /etc/profile (again with the limit value after each flag)
ulimit -n
ulimit -u
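To confirm that the new limits are in effect in the shell that launches HBase, the current values can be compared against a minimum. This is a minimal sketch with hypothetical function names. The thresholds are supplied by the caller, and the numbers in the usage line are placeholders, not the values from the original post.

```shell
#!/usr/bin/env bash
# Succeeds when the current limit ($1) meets the required minimum ($2).
limit_ok() {
  current=$1
  required=$2
  [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]
}

# Prints a warning for each limit of the current shell that is below the given minimum.
# $1 = minimum open files, $2 = minimum user processes
check_limits() {
  min_files=$1
  min_procs=$2
  limit_ok "$(ulimit -n)" "$min_files" || echo "open files too low: $(ulimit -n) < $min_files"
  limit_ok "$(ulimit -u)" "$min_procs" || echo "max processes too low: $(ulimit -u) < $min_procs"
}

# Usage (placeholder thresholds):
#   check_limits 10240 2048
```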
If HBase reports
2014-07-14 23:00:48,342 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:90)
at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
Check the HBase master log first; it shows
2014-07-14 23:31:51,270 INFO [master:192.168.126.8:60000] util.FSUtils: Waiting for dfs to exit safe mode...
Leave Hadoop safe mode
bin/hdfs dfsadmin -safemode leave
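Forcing the NameNode out of safe mode works, but it can hide why the NameNode stayed there (in this case, missing blocks). A gentler option, shown here as a sketch rather than as the author's method, is to poll `hdfs dfsadmin -safemode get` for a while and only force the exit if it never reports OFF. The function name and the default attempt count and delay are assumptions, and `hdfs` is assumed to be on the PATH.

```shell
#!/usr/bin/env bash
# Polls the NameNode until safe mode is reported OFF.
# $1 = number of attempts (default 30), $2 = seconds between attempts (default 2)
# Returns 0 once safe mode is off, 1 if it is still on after the last attempt.
wait_safemode_off() {
  attempts=${1:-30}
  delay=${2:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # `hdfs dfsadmin -safemode get` prints "Safe mode is ON" or "Safe mode is OFF"
    if hdfs dfsadmin -safemode get 2>/dev/null | grep -q "Safe mode is OFF"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage:
#   wait_safemode_off 30 2 || bin/hdfs dfsadmin -safemode leave
```

`hdfs dfsadmin -safemode wait` does the waiting without a timeout. The loop above only adds an upper bound.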
The master log then reports an error
2014-07-14 23:32:22,238 WARN [master:192.168.126.8:60000] hdfs.DFSClient: DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1761102757-192.168.126.8-1404787541755:blk_1073741825_1001 file=/hbase/hbase.version
Check HDFS
./hdfs fsck / -files -blocks
// :: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://localhost:50070
FSCK started by zhenweiliu (auth:SIMPLE) from /127.0.0.1 for path / at Mon Jul :: CST
.
/hbase/WALs/192.168.126.8,,-splitting/192.168.126.8%2C60020%2C1404917152583.: CORRUPT blockpool BP--192.168.126.8- block blk_1073741842 /hbase/WALs/192.168.126.8,,-splitting/192.168.126.8%2C60020%2C1404917152583.: MISSING blocks of total size B..
/hbase/WALs/192.168.126.8,,-splitting/192.168.126.8%2C60020%2C1404917152583..meta: CORRUPT blockpool BP--192.168.126.8- block blk_1073741843 /hbase/WALs/192.168.126.8,,-splitting/192.168.126.8%2C60020%2C1404917152583..meta: MISSING blocks of total size B..
/hbase/data/hbase/meta/.tabledesc/.tableinfo.: CORRUPT blockpool BP--192.168.126.8- block blk_1073741829 /hbase/data/hbase/meta/.tabledesc/.tableinfo.: MISSING blocks of total size B..
/hbase/data/hbase/meta//.regioninfo: CORRUPT blockpool BP--192.168.126.8- block blk_1073741827 /hbase/data/hbase/meta//.regioninfo: MISSING blocks of total size B..
/hbase/data/hbase/meta//info/e63bf8b1e649450895c36f28fb88da98: CORRUPT blockpool BP--192.168.126.8- block blk_1073741836 /hbase/data/hbase/meta//info/e63bf8b1e649450895c36f28fb88da98: MISSING blocks of total size B..
/hbase/data/hbase/meta//oldWALs/hlog.: CORRUPT blockpool BP--192.168.126.8- block blk_1073741828 /hbase/data/hbase/meta//oldWALs/hlog.: MISSING blocks of total size B..
/hbase/data/hbase/namespace/.tabledesc/.tableinfo.: CORRUPT blockpool BP--192.168.126.8- block blk_1073741832 /hbase/data/hbase/namespace/.tabledesc/.tableinfo.: MISSING blocks of total size B..
/hbase/data/hbase/namespace/a3fbb84530e05cab6319257d03975e6b/.regioninfo: CORRUPT blockpool BP--192.168.126.8- block blk_1073741833 /hbase/data/hbase/namespace/a3fbb84530e05cab6319257d03975e6b/.regioninfo: MISSING blocks of total size B..
/hbase/data/hbase/namespace/a3fbb84530e05cab6319257d03975e6b/info/770eb1a6dc76458fb97e9213edb80b72: CORRUPT blockpool BP--192.168.126.8- block blk_1073741837 /hbase/data/hbase/namespace/a3fbb84530e05cab6319257d03975e6b/info/770eb1a6dc76458fb97e9213edb80b72: MISSING blocks of total size B..
/hbase/hbase.id: CORRUPT blockpool BP--192.168.126.8- block blk_1073741826 /hbase/hbase.id: MISSING blocks of total size B..
/hbase/hbase.version: CORRUPT blockpool BP--192.168.126.8- block blk_1073741825 /hbase/hbase.version: MISSING blocks of total size B.
Status: CORRUPT
Total size: B
Total dirs:
Total files:
Total symlinks:
Total blocks (validated): (avg. block size B)
********************************
CORRUPT FILES:
MISSING BLOCKS:
MISSING SIZE: B
CORRUPT BLOCKS:
********************************
Minimally replicated blocks: (0.0 %)
Over-replicated blocks: (0.0 %)
Under-replicated blocks: (0.0 %)
Mis-replicated blocks: (0.0 %)
Default replication factor:
Average block replication: 0.0
Corrupt blocks:
Missing replicas:
Number of data-nodes:
Number of racks:
FSCK ended at Mon Jul :: CST in milliseconds
The filesystem under path '/' is CORRUPT
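The fsck report is long, and before deleting anything it is worth listing exactly which files are affected. A small filter can reduce the report to the affected paths. This is a sketch with a hypothetical function name, and it assumes the "<path>: CORRUPT blockpool ..." line shape seen in the output above.

```shell
#!/usr/bin/env bash
# Reads `hdfs fsck` output on stdin and prints each path flagged CORRUPT, once.
corrupt_paths() {
  grep ': CORRUPT blockpool ' | sed 's/: CORRUPT blockpool .*//' | sort -u
}

# Usage:
#   ./hdfs fsck / -files -blocks | corrupt_paths
```

fsck also has a built-in `-list-corruptfileblocks` option that prints a similar list directly.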
Delete the corrupt files
./hdfs fsck / -delete
// :: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://localhost:50070
FSCK started by zhenweiliu (auth:SIMPLE) from /127.0.0.1 for path / at Mon Jul :: CST
Status: HEALTHY
Total size: B
Total dirs:
Total files:
Total symlinks:
Total blocks (validated):
Minimally replicated blocks:
Over-replicated blocks:
Under-replicated blocks:
Mis-replicated blocks:
Default replication factor:
Average block replication: 0.0
Corrupt blocks:
Missing replicas:
Number of data-nodes:
Number of racks:
FSCK ended at Mon Jul :: CST in milliseconds
The filesystem under path '/' is HEALTHY
At this point HBase had died; the master log shows
-- ::, FATAL [master:192.168.126.8:] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.util.FileSystemVersionException: HBase file layout needs to be upgraded. You have version null and I want version . Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'.
at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:)
at java.lang.Thread.run(Thread.java:)
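Rebuild the HBase directory on HDFS by removing it (HBase recreates /hbase on the next start; any data still in it is lost)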
bin/hadoop fs -rm -r /hbase
The HBase master then reports
-- ::, INFO [master:192.168.126.8:] catalog.CatalogTracker: Failed verification of hbase:meta,, at address=192.168.126.8,,, exception=org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region hbase:meta,, is not online on 192.168.126.8,,
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionInfo(HRegionServer.java:)
at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$.callBlockingMethod(AdminProtos.java:)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$(SimpleRpcScheduler.java:)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$.run(SimpleRpcScheduler.java:)
at java.lang.Thread.run(Thread.java:)
Recreate the region server znode by deleting the stale one in ZK
bin/hbase zkcli
rmr /hbase/meta-region-server
Restart HBase once more and the problem is resolved.
Important HBase parameters
These parameters are configured in hbase-site.xml
1. zookeeper.session.timeout
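The default quoted here is 3 minutes (check hbase-default.xml for the value your version ships with; the setting is given in milliseconds). With that value, when a server goes down the Master needs at least 3 minutes to notice the failure and begin recovery. You may want to shorten the timeout so that the Master notices sooner. Before you change it, make sure your JVM GC settings are under control, otherwise a single long GC pause can exceed the timeout. (You may be fine with that: if a RegionServer has been stuck in a long GC, you probably do want recovery to start.)
To change this setting, edit hbase-site.xml, deploy the configuration to the whole cluster, and restart.
We set this value high to save ourselves from fielding newbie questions on the forums all day. "Why did my RegionServer die during a large data import?" is usually caused by a long GC pause on a JVM that was never tuned. Our thinking is that someone still new to HBase cannot be expected to know everything, and we would rather not knock their confidence. Once they are more familiar with it, they can tune this parameter themselves.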
2. hbase.regionserver.handler.count
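This setting determines the number of threads that serve user requests. The default is 10, which is deliberately low so that users running a large write buffer with many concurrent clients do not bring their region servers down. The rule of thumb is to keep this value low when request payloads are large (around a MB or more, such as big puts or scans with caching) and to raise it when payloads are small (gets, small puts, ICVs, deletes).
When client payloads are small, it is safe to set this value to the maximum number of clients. A typical example is a cluster serving a website, where puts are usually not buffered and most operations are gets.
The danger of a high value is that the combined size of all in-flight puts puts heavy pressure on memory and can even cause an OutOfMemory error. A RegionServer running short of memory triggers GC more and more often, until the pauses become noticeable (the memory held by request payloads cannot be reclaimed no matter how many times GC runs). After a while the whole cluster is affected, because every request that hits that region server slows down, which makes the problem worse.
You can get a feel for whether you have too many or too few handlers by following Section 12.2.2.1, "Enabling RPC-level logging", on a single RegionServer and then tailing its log (queued requests consume memory).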
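Both parameters end up as `<property>` elements in hbase-site.xml, in the same shape as the file shown earlier. The sketch below prints such elements from a small shell helper. The helper name is hypothetical and the two values are illustrative assumptions, not recommendations: 60000 ms is a one-minute session timeout and 30 handlers is an arbitrary example.

```shell
#!/usr/bin/env bash
# Prints one hbase-site.xml <property> element for the given name ($1) and value ($2).
hbase_property() {
  printf '<property>\n<name>%s</name>\n<value>%s</value>\n</property>\n' "$1" "$2"
}

# Illustrative values only; zookeeper.session.timeout is in milliseconds.
hbase_property zookeeper.session.timeout 60000
hbase_property hbase.regionserver.handler.count 30
```

Paste the printed elements inside the `<configuration>` element of hbase-site.xml, then deploy the file to every node and restart, as described above.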