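[Original] Big Data Basics: Ambari (4) Deploying Impala through Ambari
Installing Impala 2.12 (the latest version is installed automatically) on Ambari 2.7.3 (HDP 3.1)
The HDP stack in Ambari does not natively support installing Impala. This post shows how to add Impala support to Ambari through a management pack (mpack):
Part I: Install the Service
1 Download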
# wget https://github.com/cas-bigdatalab/ambari-impala-service/raw/master/ambari-impala-mpack-2.6.0-0816.tar.gz
2 Install
# ambari-server install-mpack --mpack=/path/to/ambari-impala-mpack-2.6.0-0816.tar.gz --verbose
3 Restart
# ambari-server restart
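After the restart, Impala still does not appear among the services in the Ambari web UI. Checking mpack.json reveals the problem: the mpack only declares HDP 2.6 as an applicable stack.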
/var/lib/ambari-server/resources/mpacks/ambari-impala-mpack-2.6.0-0816/mpack.json
{"name": "ambari-impala-mpack", "prerequisites": {"max-ambari-version": "", "min-ambari-version": "2.4.0.0"}, "artifacts": [{"type": "service-definitions", "name": "ambari-common-services", "source_dir": "common-services"}, {"service_versions_map": [{"service_name": "IMPALA", "applicable_stacks": [{"stack_name": "HDP", "stack_version": "2.6"}], "service_version": "2.6.0"}], "type": "stack-addon-service-definitions", "name": "ambari-addon-services", "source_dir": "addon-services"}], "version": "2.6.0-0816", "type": "full-release", "description": "Ambari Management Pack ambari-impala-mpack"}
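A quick way to see which stack versions an mpack declares is to pull the stack_version values out of mpack.json. The sketch below assumes the single-line JSON layout shown above; mpack_stack_versions is a hypothetical helper name, not part of Ambari.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ambari): print every "stack_version"
# value found in an mpack.json file, one per line.
mpack_stack_versions() {
  # grep -o prints each match on its own line; sed keeps only the quoted value.
  grep -o '"stack_version": *"[^"]*"' "$1" | sed 's/.*"\([^"]*\)"$/\1/'
}

# Usage (path taken from the article):
# mpack_stack_versions /var/lib/ambari-server/resources/mpacks/ambari-impala-mpack-2.6.0-0816/mpack.json
```

If the cluster's stack version is missing from the output, Ambari will not offer the service for that stack.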
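Edit applicable_stacks and add an entry for HDP 3.1 next to the existing 2.6 entry:
{"stack_name": "HDP", "stack_version": "2.6"}, {"stack_name": "HDP", "stack_version": "3.1"}
Then repack ambari-impala-mpack-2.6.0-0816.tar.gz.
Note: the Impala version does not need to be changed; the latest version is installed by default.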
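The edit and repack can be scripted. This is a minimal sketch, assuming mpack.json uses exactly the spacing shown above and that the tarball unpacks into a directory named after itself; add_hdp_stack is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: add another HDP entry right after the existing
# HDP 2.6 entry in mpack.json. Assumes the exact spacing shown in the article.
add_hdp_stack() {
  file="$1"
  new_version="$2"
  old='{"stack_name": "HDP", "stack_version": "2\.6"}'
  new='{"stack_name": "HDP", "stack_version": "'"$new_version"'"}'
  # "&" in the replacement stands for the matched 2.6 entry, so it is kept.
  sed "s/${old}/&, ${new}/" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Usage (directory name is an assumption about the tarball layout):
# tar -xzf ambari-impala-mpack-2.6.0-0816.tar.gz
# add_hdp_stack ambari-impala-mpack-2.6.0-0816/mpack.json 3.1
# tar -czf ambari-impala-mpack-2.6.0-0816.tar.gz ambari-impala-mpack-2.6.0-0816
```

Running the helper twice would add the entry twice, so it is worth checking the file before repacking.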
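4 Uninstall
# ambari-server uninstall-mpack --mpack-name=ambari-impala-mpack
5 Install (same as above, using the repacked tarball)
6 Restart (same as above)
Part II: Install Impala
Follow the install wizard in the Ambari web UI. A yum repo has to be added: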
# cat /etc/yum.repos.d/impala.repo
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 7 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5/
gpgkey =https://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck = 1
Note that this repo is for RedHat 7; change the 7 to 6 if your hosts need it.
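Switching the OS version means changing both the baseurl and the gpgkey line. A small sketch that generates the file for a given major version, following the URL pattern above; impala_repo is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the impala.repo content for a given RedHat/CentOS
# major version (6 or 7). URLs follow the pattern used in the article.
impala_repo() {
  el="$1"
  base="https://archive.cloudera.com/cdh5/redhat/${el}/x86_64/cdh"
  cat <<EOF
[cloudera-cdh5]
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=${base}/5/
gpgkey=${base}/RPM-GPG-KEY-cloudera
gpgcheck=1
EOF
}

# Usage: impala_repo 6 > /etc/yum.repos.d/impala.repo
```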
Part III: Start Impala
impala-state-store, impala-catalog and impala-server all fail to start. OMG.
The errors are worked through one by one below:
1 Errors in the log
# more /var/log/impala/impala-catalog.log
find: '/usr/local/jdk/': No such file or directory
find: '/usr/local/jdk/': No such file or directory
find: '/usr/local/jdk/': No such file or directory
/usr/lib/impala/sbin/impalad: error while loading shared libraries: libjsig.so: cannot open shared object file: No such file or directory
The startup script looks for the JDK under /usr/local/jdk, so add a symlink to the actual JDK directory:
# ln -s /data/jdk1.8.0_191 /usr/local/jdk
Now impala-state-store starts, but impala-catalog and impala-server still fail.
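Before restarting, it can help to confirm that the linked JDK really contains libjsig.so, the library named in the error. A minimal sketch; jdk_has_libjsig is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if libjsig.so exists somewhere under the given
# JDK directory. The trailing slash makes find follow a symlinked directory.
jdk_has_libjsig() {
  found=$(find "$1/" -name libjsig.so 2>/dev/null | head -n 1)
  [ -n "$found" ]
}

# Usage: jdk_has_libjsig /usr/local/jdk && echo "libjsig.so found"
```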
2 Assorted ClassNotFoundException errors
ClassNotFoundException: org.apache.hadoop.conf.Configuration
java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Check the lib directory, /usr/lib/impala/lib/*:
lrwxrwxrwx 1 root root 43 Jan 19 07:34 hadoop-common.jar -> /usr/hdp/None/hadoop/hadoop-common.jar
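Many of the symlinks are broken: they point to a nonexistent directory, /usr/hdp/None, while the real directory is /usr/hdp/3.1.0.0-78. Replace all of them with links to the correct location. The two commands below only print the rm and ln commands to run; generate both lists before executing anything, because once the links are removed the second listing comes back empty:
# cd /usr/lib/impala/lib
# ls -l|grep None|awk '{print "/bin/rm "$9}'
# ls -l|grep None|awk '{print "ln -s "$11" "$9}'|sed 's/None/3.1.0.0-78/g'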
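The same repair can be done in one pass without parsing ls output. This is a sketch, assuming every broken link contains /None/ in its target; relink_hdp_links is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: repoint every symlink in a directory whose target
# contains "/None/" so that it uses the given HDP version instead.
relink_hdp_links() {
  lib_dir="$1"
  hdp_version="$2"
  for link in "$lib_dir"/*; do
    [ -L "$link" ] || continue            # skip regular files
    target=$(readlink "$link")
    case "$target" in
      */None/*)
        fixed=$(printf '%s' "$target" | sed "s#/None/#/${hdp_version}/#")
        ln -sfn "$fixed" "$link"          # replace the link in place
        ;;
    esac
  done
}

# Usage: relink_hdp_links /usr/lib/impala/lib 3.1.0.0-78
```

Because ln -sfn replaces each link in place, there is no separate rm step and no ordering problem between removing and recreating links.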
That produces a different error:
ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper
java.lang.NoClassDefFoundError: com/ctc/wstx/io/InputBootstrapper
Caused by: java.lang.ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
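This class lives in woodstox-core-5.0.3.jar, a dependency that Hadoop introduced in version 2.9. HDP 3.1 ships Hadoop 3.1.1, so I tried copying this jar into /usr/lib/impala/lib/, but the error persisted. The Hadoop jars in this directory carry no version numbers in their names, so my guess is that Impala loads jars by hard-coded file names and never picks up a jar added afterwards. That leaves downgrading the Hadoop jars:
hadoop2.7.2
hadoop-auth.jar
hadoop-common.jar
hadoop-hdfs.jar
hadoop-mapreduce-client-core.jar
hive2.1.0
hive-common.jar
hive-metastore.jar
hbase1.2.6
hbase-common.jar
hbase-client.jar
After replacing the jars under /usr/lib/impala/lib/ with these versions, the ClassNotFoundException disappears.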
3 Permissions
E0119 18:16:24.230655 29797 impala-server.cc:285] Invalid short-circuit reads configuration:
- Impala cannot read or execute the parent directory of dfs.domain.socket.path
Look at the directory that the HDFS setting dfs.domain.socket.path points to:
# ls -l /var/lib
drwxr-x--x 3 hdfs hadoop 36 Jan 19 17:05 hadoop-hdfs
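Impala runs as the impala user, which is neither the owner nor in the hadoop group, so it cannot read the directory.
# usermod -a -G hadoop impala
Adding impala to the hadoop group fixes this. The -a flag appends; without it, -G replaces the user's existing supplementary groups.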
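Group membership can be verified before restarting the service. A minimal sketch; user_in_group is a hypothetical helper name built on the standard id command.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the user belongs to the group
# (primary or supplementary), according to `id -nG`.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Usage: user_in_group impala hadoop && echo "impala is in hadoop"
```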
4 NoSuchMethodError
# cat /var/log/impala/catalogd.ERROR
F0119 18:21:35.781538 16153 catalog.cc:90] java.lang.NoSuchMethodError: org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(Lorg/apache/hadoop/hive/conf/HiveConf;Lorg/apache/hadoop/hive/metastore/HiveMetaHookLoader;Ljava/lang/String;)Lorg/apache/hadoop/hive/metastore/IMetaStoreClient;
at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:93)
at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:72)
at org.apache.impala.catalog.MetaStoreClientPool.initClients(MetaStoreClientPool.java:168)
at org.apache.impala.catalog.Catalog.<init>(Catalog.java:100)
at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:263)
at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:121)
The failing class belongs to the Hive metastore. Inspect it:
# javap -cp hive-metastore.jar org.apache.hadoop.hive.metastore.RetryingMetaStoreClient
Compiled from "RetryingMetaStoreClient.java"
public class org.apache.hadoop.hive.metastore.RetryingMetaStoreClient implements java.lang.reflect.InvocationHandler {
protected org.apache.hadoop.hive.metastore.RetryingMetaStoreClient(org.apache.hadoop.hive.conf.HiveConf, java.lang.Class<?>[], java.lang.Object[], java.util.concurrent.ConcurrentHashMap<java.lang.String, java.lang.Long>, java.lang.Class<? extends org.apache.hadoop.hive.metastore.IMetaStoreClient>) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.hive.conf.HiveConf, boolean) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.hive.conf.HiveConf, org.apache.hadoop.hive.metastore.HiveMetaHookLoader, java.lang.String) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.hive.conf.HiveConf, org.apache.hadoop.hive.metastore.HiveMetaHookLoader, java.util.concurrent.ConcurrentHashMap<java.lang.String, java.lang.Long>, java.lang.String, boolean) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.hive.conf.HiveConf, java.lang.Class<?>[], java.lang.Object[], java.lang.String) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.hive.conf.HiveConf, java.lang.Class<?>[], java.lang.Object[], java.util.concurrent.ConcurrentHashMap<java.lang.String, java.lang.Long>, java.lang.String) throws org.apache.hadoop.hive.metastore.api.MetaException;
public java.lang.Object invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) throws java.lang.Throwable;
static {};
}
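The method signature does exist. A NoSuchMethodError does not mean the method never existed (the calling code would not have compiled otherwise). It usually means a jar conflict: two copies of the same class are on the classpath, one with the method and one without, and the error is thrown at runtime when the copy without it gets loaded. Checking the other jars confirms this: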
# javap -cp hive-exec.jar org.apache.hadoop.hive.metastore.RetryingMetaStoreClient
Compiled from "RetryingMetaStoreClient.java"
public class org.apache.hadoop.hive.metastore.RetryingMetaStoreClient implements java.lang.reflect.InvocationHandler {
protected org.apache.hadoop.hive.metastore.RetryingMetaStoreClient(org.apache.hadoop.conf.Configuration, java.lang.Class<?>[], java.lang.Object[], java.util.concurrent.ConcurrentHashMap<java.lang.String, java.lang.Long>, java.lang.Class<? extends org.apache.hadoop.hive.metastore.IMetaStoreClient>) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.conf.Configuration, boolean) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.conf.Configuration, org.apache.hadoop.hive.metastore.HiveMetaHookLoader, java.lang.String) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.conf.Configuration, org.apache.hadoop.hive.metastore.HiveMetaHookLoader, java.util.concurrent.ConcurrentHashMap<java.lang.String, java.lang.Long>, java.lang.String, boolean) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.conf.Configuration, java.lang.Class<?>[], java.lang.Object[], java.lang.String) throws org.apache.hadoop.hive.metastore.api.MetaException;
public static org.apache.hadoop.hive.metastore.IMetaStoreClient getProxy(org.apache.hadoop.conf.Configuration, java.lang.Class<?>[], java.lang.Object[], java.util.concurrent.ConcurrentHashMap<java.lang.String, java.lang.Long>, java.lang.String) throws org.apache.hadoop.hive.metastore.api.MetaException;
public java.lang.Object invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) throws java.lang.Throwable;
static org.apache.hadoop.hive.metastore.IMetaStoreClient access$000(org.apache.hadoop.hive.metastore.RetryingMetaStoreClient);
static {};
}
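hive-exec.jar also contains a RetryingMetaStoreClient, and its getProxy methods take org.apache.hadoop.conf.Configuration instead of HiveConf, so the signature Impala calls is missing there. Replace this jar as well: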
hive2.1.0
hive-exec.jar
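Duplicate classes like this can be found by listing every jar in the directory that contains a given class. The sketch below assumes unzip is installed; class_to_entry and jars_containing_class are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: turn a fully qualified class name into the path
# of its entry inside a jar, e.g. a.b.C -> a/b/C.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: print every jar in a directory whose listing
# contains the class. More than one line of output means a possible conflict.
jars_containing_class() {
  dir="$1"
  entry=$(class_to_entry "$2")
  for jar in "$dir"/*.jar; do
    [ -f "$jar" ] || continue             # skips broken symlinks too
    if unzip -l "$jar" 2>/dev/null | grep -qF -- "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}

# Usage:
# jars_containing_class /usr/lib/impala/lib org.apache.hadoop.hive.metastore.RetryingMetaStoreClient
```

The same helper also answers the opposite question: no output means no jar in the directory provides the class.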
5 NoClassDefFoundError
F0119 18:59:40.175907 17009 catalog.cc:90] java.lang.IllegalStateException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)
at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:72)
at org.apache.impala.catalog.MetaStoreClientPool.initClients(MetaStoreClientPool.java:168)
at org.apache.impala.catalog.Catalog.<init>(Catalog.java:100)
at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:263)
at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:121)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1627)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:80)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:130)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:94)
at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:93)
... 5 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1625)
... 10 more
Caused by: java.lang.NoClassDefFoundError: org/datanucleus/PersistenceNucleusContext
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getClass(MetaStoreUtils.java:1592)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:64)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:581)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:546)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:608)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:398)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6396)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
... 14 more
/usr/lib/impala/lib/ contains these datanucleus jars:
datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.12.jar
datanucleus-rdbms-3.2.12.jar
Checking with javap shows that none of these jars contains org.datanucleus.PersistenceNucleusContext. Hive 2.1.0 depends on a newer version:
datanucleus-core-4.1.6.jar
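Adding that jar to /usr/lib/impala/lib/ does not help either. The only option is to downgrade Hive to 1.2.1, because that version uses datanucleus 3. With this change all of the startup problems are resolved, and impala-catalog and impala-server both start successfully.
6 Error when accessing the Impala server through Hue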
NoClassDefFoundError: org/apache/hive/service/cli/thrift/TGetInfoReq CAUSED BY: ClassNotFoundException: org.apache.hive.service.cli.thrift.TGetInfoReq
This class is in hive-service-1.2.1.jar, so one more jar has to be replaced:
hive1.2.0
hive-service.jar
Note that Hue must connect to Impala on port 21050; it cannot use port 21000 the way impala-shell does, otherwise it fails with:
Invalid method name 'OpenSession'
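For reference, the port is set in the [impala] section of hue.ini. The sketch below prints such a section for a given host, assuming the standard server_host and server_port keys; hue_impala_section is a hypothetical helper name and the host is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print an [impala] section for hue.ini that points
# Hue at port 21050 on the given Impala daemon host.
hue_impala_section() {
  cat <<EOF
[impala]
server_host=$1
server_port=21050
EOF
}

# Usage (host name is a placeholder): hue_impala_section impalad-host.example.com
```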
Finally, a summary of the jars that need to be replaced:
hadoop2.7.2
hadoop-auth.jar
hadoop-common.jar
hadoop-hdfs.jar
hadoop-mapreduce-client-core.jar
hive1.2.0
hive-common.jar
hive-metastore.jar
hive-exec.jar
hive-service.jar
hbase1.2.6
hbase-common.jar
hbase-client.jar
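Downloaded jars normally carry a version in the file name, while the files in /usr/lib/impala/lib/ do not. The sketch below copies jars from a staging directory under their unversioned names, assuming names of the form name-x.y.z.jar; install_unversioned_jars is a hypothetical helper name. It removes any existing destination first, so that a symlink into /usr/hdp is replaced rather than written through.

```shell
#!/bin/sh
# Hypothetical helper: copy every jar from src_dir into dest_dir, dropping
# the trailing version from the file name (hadoop-common-2.7.2.jar ->
# hadoop-common.jar).
install_unversioned_jars() {
  src_dir="$1"
  dest_dir="$2"
  for jar in "$src_dir"/*.jar; do
    [ -f "$jar" ] || continue
    name=$(basename "$jar" | sed 's/-[0-9][0-9.]*\.jar$/.jar/')
    # Remove first: cp onto a symlink would overwrite the link's target.
    rm -f "$dest_dir/$name"
    cp "$jar" "$dest_dir/$name"
  done
}

# Usage (staging directory is a placeholder):
# install_unversioned_jars /tmp/impala-jars /usr/lib/impala/lib
```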
References:
https://github.com/cas-bigdatalab/ambari-impala-service