A Record: Offline Deployment of a Big Data Platform CDH Cluster

Tags: Cloudera-Manager CDH Hadoop Deploy Cluster

Abstract: Deploying and managing Hadoop clusters requires tools such as Cloudera Manager. In this article, I briefly compare the available tools and then record in detail the steps for deploying a CDH cluster offline. Finally, I expound the theory of 'handle delicately'.


Preface

The emergence of Big Data technology, led by Apache Hadoop, has given small and medium-sized enterprises the ability to store and process big data as well. At present there are many Hadoop distributions, such as the HUAWEI distribution, the Intel distribution, Cloudera's Distribution Including Apache Hadoop (CDH, free), and Hortonworks Data Platform (HDP, free). All of these are based on the Apache Hadoop community edition.

Deploying and managing a Hadoop cluster with tens of nodes or more requires advanced tools. Apache Ambari, from Hortonworks, is one such tool: it provides an easy-to-use web interface, backed by RESTful APIs, for managing Hadoop. Cloudera provides a similar tool, Cloudera Manager (CM), to configure, monitor, and manage CDH clusters.

The main content of this article is a record of building a CDH cluster. Take special care when choosing the Cloudera Manager version, because it depends on the operating system; el7 is not supported by Cloudera Manager at the time of writing. Follow [the official document][1], otherwise the installation will hit a wall.

This article is based on CentOS 6.5 (64-bit), Cloudera Manager 5.3.6, and JDK 1.7.


Deploy CDH

Configure network (All nodes)

    [root@cdh-server ~]# vi /etc/sysconfig/network   # set the hostname:
    NETWORKING=yes
    HOSTNAME=cdh-server
    [root@cdh-server ~]# vi /etc/hosts   # map IP addresses to hostnames:
    192.168.180.173 cdh-server
    192.168.180.175 node175
    [root@cdh-server ~]# service network restart   # restart the network service to apply the changes
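Because every node needs the same hostname mappings, it can help to check the hosts file on each machine before going further. The following is a minimal sketch, not part of the original procedure; the function names `host_in_file` and `check_hosts` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: confirm that every expected hostname has an entry in a hosts file.

# host_in_file FILE NAME
# Returns 0 if NAME appears as a hostname (field 2 or later) on a non-comment line.
host_in_file() {
    local file="$1" name="$2"
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# check_hosts FILE NAME...
# Prints "missing: NAME" for each absent name; returns non-zero if any is missing.
check_hosts() {
    local file="$1" missing=0 name
    shift
    for name in "$@"; do
        if ! host_in_file "$file" "$name"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

On each node, this could be run as `check_hosts /etc/hosts cdh-server node175`; no output and a zero exit status mean both names are mapped.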

Install JDK (All nodes)

    # Uninstall OpenJDK
    [root@cdh-server user1]# rpm -qa | grep java
    [root@cdh-server user1]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
    [root@cdh-server user1]# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.66.1.13.0.el6.x86_64
    [root@cdh-server user1]# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
    # Install the JDK
    [root@cdh-server user1]# chmod a+x jdk-7u79-linux-x64.rpm
    [root@cdh-server user1]# rpm -ivh jdk-7u79-linux-x64.rpm
    [root@cdh-server user1]# echo "JAVA_HOME=/usr/java/jdk1.7.0_79/" >>
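The exact package versions of the bundled GCJ and OpenJDK builds differ from machine to machine, so typing each name by hand is error-prone. A small filter such as the one below could select them from the `rpm -qa` output instead. This is only a sketch; the function name `select_bundled_java` and the name pattern are assumptions based on the three package names shown above.

```shell
#!/usr/bin/env bash
# Sketch: pick the bundled GCJ/OpenJDK packages out of a package list.

# select_bundled_java
# Reads package names on stdin and prints those that look like
# java-<version>-gcj... or java-<version>-openjdk...
# An Oracle JDK package (whose name starts with "jdk") is not matched.
select_bundled_java() {
    grep -E '^java-[0-9.]+-(gcj|openjdk)' || true   # no match is not an error
}

# Possible use on a node (review the list before removing anything):
#   rpm -qa | select_bundled_java
#   rpm -qa | select_bundled_java | xargs -r rpm -e --nodeps
```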

Install MySQL (Master)

    [user1@cdh-server]$ cd /home/user1
    [user1@cdh-server]$ tar -zxvf mysql-5.6.26-linux-glibc2.5-x86_64.tar.gz
    [user1@cdh-server]$ mv mysql-5.6.26-linux-glibc2.5-x86_64 mysql-5.6.26
    [user1@cdh-server]$ cd mysql-5.6.26/
    [user1@cdh-server]$ vi support-files/my.cnf   # create a new file

    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    [mysqld]
    character-set-server=utf8
    default-storage-engine=INNODB
    # Uncomment the following if you are using InnoDB tables
    innodb_data_home_dir = /home/user1/mysql-5.6.26/data
    innodb_data_file_path = ibdata1:10M:autoextend
    innodb_log_group_home_dir = /home/user1/mysql-5.6.26/data
    # You can set .._buffer_pool_size up to 50 - 80 %
    # of RAM but beware of setting memory usage too high
    innodb_buffer_pool_size = 16M
    innodb_additional_mem_pool_size = 2M
    # Set .._log_file_size to 25 % of buffer pool size
    innodb_log_file_size = 5M
    innodb_log_buffer_size = 8M
    innodb_flush_log_at_trx_commit = 1
    innodb_lock_wait_timeout = 50
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
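The comments in my.cnf give two rules of thumb: a buffer pool of up to 50 to 80 % of RAM, and a log file size of 25 % of the buffer pool. The arithmetic can be sketched as follows. The function name `innodb_sizes` and the 50 % default are illustrative assumptions; on a master that also runs Cloudera Manager, a much smaller share is probably more sensible.

```shell
#!/usr/bin/env bash
# Sketch: derive InnoDB sizes from the rules of thumb quoted in my.cnf.

# innodb_sizes RAM_MB [PERCENT]
# Prints a buffer pool size of PERCENT % of RAM_MB (default 50)
# and a log file size of 25 % of that buffer pool, both in whole MB.
innodb_sizes() {
    local ram_mb="$1" pct="${2:-50}"
    local pool=$(( ram_mb * pct / 100 ))   # integer arithmetic, rounds down
    local log=$(( pool / 4 ))              # 25 % of the buffer pool
    echo "innodb_buffer_pool_size = ${pool}M"
    echo "innodb_log_file_size = ${log}M"
}
```

For example, `innodb_sizes 8192` prints a 4096M buffer pool and a 1024M log file. The values used in this article (16M and 5M) are far smaller and are enough for the Cloudera Manager metadata of a small test cluster.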

Initialize MySQL (Master)

    [user1@cdh-server]$ ./scripts/mysql_install_db --defaults-file=/home/user1/mysql-5.6.26/support-files/my.cnf --basedir=/home/user1/mysql-5.6.26 --datadir=/home/user1/mysql-5.6.26/data --user=user1
    [user1@cdh-server]$ ./bin/mysqld --defaults-file=/home/user1/mysql-5.6.26/support-files/my.cnf --basedir=/home/user1/mysql-5.6.26 --datadir=/home/user1/mysql-5.6.26/data > mysql.log 2>&1 &
    [user1@cdh-server]$ ./bin/mysqladmin -u root password '123456'

    [user1@cdh-server mysql-5.6.26]$ ./bin/mysql -uroot -p'123456'
    # used by Hive
    mysql> create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
    Query OK, 1 row affected (0.00 sec)
    # used by Activity Monitor
    mysql> create database amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
    Query OK, 1 row affected (0.01 sec)
    # used by Navigator Audit Server
    mysql> create database audit DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
    Query OK, 1 row affected (0.01 sec)
    # used by Navigator Metadata Server
    mysql> create database metadata DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
    Query OK, 1 row affected (0.01 sec)
    mysql> grant all privileges on *.* to 'root'@'localhost' identified by '123456' with grant option;
    Query OK, 0 rows affected (0.00 sec)
    mysql> grant all privileges on *.* to 'root'@'cdh-server' identified by '123456' with grant option;
    Query OK, 0 rows affected (0.00 sec)
    # the scm user is for Cloudera Manager
    mysql> grant all privileges on *.* to 'scm'@'localhost' identified by 'scm' with grant option;
    Query OK, 0 rows affected (0.00 sec)
    mysql> grant all privileges on *.* to 'scm'@'cdh-server' identified by 'scm' with grant option;
    Query OK, 0 rows affected (0.00 sec)
    mysql> flush privileges;
    Query OK, 0 rows affected (0.00 sec)
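The four `create database` statements differ only in the database name, so they can be generated rather than typed. The sketch below prints the SQL without executing it; the function name `gen_create_db` is made up for illustration, and the added `IF NOT EXISTS` clause is my own choice so that rerunning the script is harmless.

```shell
#!/usr/bin/env bash
# Sketch: generate the CREATE DATABASE statements for the CM/CDH services.

# gen_create_db NAME...
# Prints one CREATE DATABASE statement per name, using the same charset
# and collation as the statements in this article.
gen_create_db() {
    local db
    for db in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db"
    done
}

# Possible use from the MySQL directory (prompts for the root password):
#   gen_create_db hive amon audit metadata | ./bin/mysql -uroot -p
```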

Deploy & Start CM-Server (Master)

    [user1@cdh-server ~]$ tar -zxvf cloudera-manager-el6-cm5.3.6_x86_64.tar.gz
    [user1@cdh-server ~]$ cp mysql-connector-java-5.1.33-bin.jar ./cm-5.3.6/share/cmf/lib/
    [user1@cdh-server ~]$ su - root
    [root@cdh-server ~]# cd /home/user1/
    [root@cdh-server user1]# cp -rf cloudera /opt
    [root@cdh-server user1]# mv CDH-5.3.6-1.cdh5.3.6.p0.11-el6.parcel /opt/cloudera/parcel-repo/CDH-5.3.6-1.cdh5.3.6.p0.11-el6.parcel
    [root@cdh-server user1]# mv CDH-5.3.6-1.cdh5.3.6.p0.11-el6.parcel.sha /opt/cloudera/parcel-repo/CDH-5.3.6-1.cdh5.3.6.p0.11-el6.parcel.sha
    [root@cdh-server user1]# mv manifest.json /opt/cloudera/parcel-repo/manifest.json
    [root@cdh-server user1]# ./cm-5.3.6/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost:3306 -uroot -p123456 --scm-host localhost scm scm scm
    [root@cdh-server user1]# ./cm-5.3.6/etc/init.d/cloudera-scm-server start
    Starting cloudera-scm-server: [ OK ]
    [root@cdh-server user1]# tail -f ./cm-5.3.6/log/cloudera-scm-server/cloudera-scm-server.log
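In an offline installation the parcel is copied by hand, so a corrupted or truncated copy is a real risk. Assuming the `.parcel.sha` file holds the SHA-1 hex digest of the parcel (this is my understanding of the CDH 5 layout, not something stated above), the copy can be checked before starting the server. The function name `verify_parcel` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare a parcel's SHA-1 digest with the one stored next to it.
# Assumption: the first field of the first line of the .sha file is the digest.

# verify_parcel PARCEL SHA_FILE
# Returns 0 if the SHA-1 of PARCEL equals the digest in SHA_FILE.
verify_parcel() {
    local parcel="$1" sha_file="$2" expected actual
    expected=$(awk 'NR == 1 { print $1 }' "$sha_file")
    actual=$(sha1sum "$parcel" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Possible use on the master:
#   cd /opt/cloudera/parcel-repo
#   verify_parcel CDH-5.3.6-1.cdh5.3.6.p0.11-el6.parcel \
#                 CDH-5.3.6-1.cdh5.3.6.p0.11-el6.parcel.sha && echo "parcel OK"
```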

Stop iptables (All nodes)

    # Stop iptables
    [root@cdh-server user1]# service iptables stop
    # Verify by opening the following address in a browser
    http://192.168.180.173:7180/
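The Cloudera Manager server can take a while to come up after it is started, so the web page may not respond immediately. Instead of refreshing the browser, a small retry helper can poll the port from the command line. This is a generic sketch; the function name `retry` and the attempt and delay values in the usage line are arbitrary choices.

```shell
#!/usr/bin/env bash
# Sketch: run a command repeatedly until it succeeds or attempts run out.

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARG...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry() {
    local attempts="$1" delay="$2" i
    shift 2
    for (( i = 1; i <= attempts; i++ )); do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"          # wait before the next attempt
        fi
    done
    return 1
}

# Possible use: poll the CM web UI every 10 seconds, up to 30 times.
#   retry 30 10 curl -sf -o /dev/null http://192.168.180.173:7180/ && echo "CM is up"
```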

Deploy & Start CM-Agent (Slaves)

    [root@cdh-server user1]# tar -zxvf cloudera-manager-el6-cm5.3.6_x86_64.tar.gz
    [root@cdh-server user1]# vi cm-5.3.6/etc/cloudera-scm-agent/config.ini

    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    # Hostname of the CM server.
    #server_host=localhost
    server_host=cdh-server
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    [root@cdh-server user1]# useradd -G sys --home=/home/user1/cm-5.3.6/run/cloudera-scm-server --no-create-home --comment "Cloudera SCM User" cloudera-scm
    [root@cdh-server user1]# useradd --comment "Cloudera SCM User" cloudera-scm   # skip this step if the previous one succeeded
    [root@cdh-server user1]# echo 0 > /proc/sys/vm/swappiness
    [root@cdh-server user1]# ./cm-5.3.6/etc/init.d/cloudera-scm-agent start
    Starting cloudera-scm-agent: [ OK ]
    [root@cdh-server user1]# tail -f ./cm-5.3.6/log/cloudera-scm-agent/cloudera-scm-agent.log
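With more than a handful of slaves, editing config.ini in vi on each one becomes tedious. The `server_host` change can be made non-interactively with sed, as sketched below. The function name `set_server_host` is made up, and the sketch assumes GNU sed (for `-i`) and an uncommented `server_host=` line in the file, as in the default config.ini.

```shell
#!/usr/bin/env bash
# Sketch: point a CM agent at its server by rewriting server_host in config.ini.
# Assumes GNU sed and an existing, uncommented "server_host=" line.

# set_server_host CONFIG_FILE HOSTNAME
# Replaces the value of the server_host key in place; other lines are untouched.
set_server_host() {
    local file="$1" host="$2"
    sed -i "s/^server_host=.*/server_host=${host}/" "$file"
}

# Possible use on each slave:
#   set_server_host cm-5.3.6/etc/cloudera-scm-agent/config.ini cdh-server
```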

Configure CDH

Open Cloudera Manager at http://192.168.180.173:7180/, create a new cluster named Cluster_user1, then start and configure the services.

    # If an error occurs while installing or configuring Hive, run this on the HiveServer host:
    [root@hive-server user1]# cp mysql-connector-java-5.1.33-bin.jar /opt/cloudera/parcels/CDH-5.3.6-1.cdh5.3.6.p0.11/lib/hive/lib/
    # Likewise, use this jar for Navigator Audit Server, Navigator Metadata Server, or Activity Monitor
    [root@cdh-server user1]# cp mysql-connector-java-5.1.33-bin.jar /usr/share/java/mysql-connector-java.jar

Others

Stop CDH

  • Stop Cloudera Management Service & Cluster_user1
  • Stop Agent (Slaves)
    [root@cdh-server user1]# ./cm-5.3.6/etc/init.d/cloudera-scm-agent stop
  • Stop Server (Master)
    [root@cdh-server user1]# ./cm-5.3.6/etc/init.d/cloudera-scm-server stop

Start CDH

  • Start MySQL (Master)
    [user1@cdh-server]$ ./bin/mysqld --defaults-file=/home/user1/mysql-5.6.26/support-files/my.cnf --basedir=/home/user1/mysql-5.6.26 --datadir=/home/user1/mysql-5.6.26/data > mysql.log 2>&1 &
  • Start Agent (Slaves)
    [root@cdh-server user1]# ./cm-5.3.6/etc/init.d/cloudera-scm-agent start
    Starting cloudera-scm-agent: [ OK ]
    [root@cdh-server user1]# tail -f ./cm-5.3.6/log/cloudera-scm-agent/cloudera-scm-agent.log
  • Start Server (Master)
    [root@cdh-server user1]# ./cm-5.3.6/etc/init.d/cloudera-scm-server start
    Starting cloudera-scm-server: [ OK ]
    [root@cdh-server user1]# tail -f ./cm-5.3.6/log/cloudera-scm-server/cloudera-scm-server.log

The theory of 'handle delicately'

Handling delicately is a feeling, but also a skill. In a seller's market, a company can turn a profit without needing to handle things delicately. If it wishes to pursue more, however, it will handle things more delicately, for example by paying more attention to detail, user experience, and so on. In a buyer's market, the relation between supply and demand forces companies to handle things delicately in order to survive.

Handling delicately is not only the driving force that pushes society to keep advancing, but also a result of that advance. Today, many internet firms operate in a buyer's market, where whoever has more users wins the fight; handling things delicately puts them at the forefront of social evolution and technological innovation.

Handling delicately appears not only in companies, but also in individuals, regions, and countries. The country that handles things more delicately is the more developed one. The company that pays more attention to detail is the more competitive one. Yet, for various reasons, handling things more delicately does not guarantee more profit.


Writer: @Angel Wang

aitanjupt@hotmail.com

2015-10-18

[1]: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/pcm_os.html

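As expected, cnblogs does not support a table of contents. The code formatting looks fairly nice, though.

This article is the English version of another article of mine. The Chinese version, 朝花夕拾之--大数据平台CDH集群离线搭建, is here: http://www.cnblogs.com/wgp13x/p/4990484.html Comments are welcome!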
