最近在测试HCatalog,由于Hcatalog本身就是一个独立JAR包,虽然它也可以运行service,但是其实这个service就是metastore thrift server,我们在写基于Hcatalog的mapreduce job时候只要把hcatalog JAR包和对应的hive-site.xml文件加入libjars和HADOOP_CLASSPATH中就可以了。不过在测试的时候还是遇到了一些问题,hive metastore server在运行了一段时间后会抛如下错误

2013-06-19 10:35:51,718 ERROR server.TThreadPoolServer (TThreadPoolServer.java:run(182)) - Error occurred during processing of message.
javax.jdo.JDOFatalUserException: Persistence Manager has been closed
at org.datanucleus.jdo.JDOPersistenceManager.assertIsOpen(JDOPersistenceManager.java:2124)
at org.datanucleus.jdo.JDOPersistenceManager.currentTransaction(JDOPersistenceManager.java:315)
at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:294)
at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:732)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at com.sun.proxy.$Proxy5.getTable(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:982)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:5017)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table.getResult(ThriftHiveMetastore.java:5005)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)

其中PersistenceManager负责控制一组持久化对象包括创建持久化对象和查询对象,它是ObjectStore的一个实例变量,每个ObjectStore拥有一个pm,RawStore是metastore逻辑层和物理底层元数据库(比如derby)交互的接口类,ObjectStore是RawStore的默认实现类。Hive Metastore Server启动的时候会指定一个TProcessor,包装了一个HMSHandler,内部有一个ThreadLocal<RawStore> threadLocalMS实例变量,每个thread维护一个RawStore

    private final ThreadLocal<RawStore> threadLocalMS =
new ThreadLocal<RawStore>() {
@Override
protected synchronized RawStore initialValue() {
return null;
}
};

每一个从hive metastore client过来的请求都会从线程池中分配一个
WorkerProcess来处理,在HMSHandler中每一个方法都会通过getMS()获取rawstore instance来做具体操作

    public RawStore getMS() throws MetaException {
RawStore ms = threadLocalMS.get();
if (ms == null) {
ms = newRawStore();
threadLocalMS.set(ms);
ms = threadLocalMS.get();
}
return ms;
}

看得出来RawStore是延迟加载,初始化后绑定到threadlocal变量中可以为以后复用

    private RawStore newRawStore() throws MetaException {
LOG.info(addPrefix("Opening raw store with implemenation class:"
+ rawStoreClassName));
Configuration conf = getConf(); return RetryingRawStore.getProxy(hiveConf, conf, rawStoreClassName, threadLocalId.get());
}

RawStore使用了动态代理模式(继承
InvocationHandler接口
),内部实现了invoke函数,通过method.invoke()执行真正的逻辑,这样的好处是可以在
method.invoke()上下文中添加自己其他的逻辑,RetryingRawStore就是在通过捕捉invoke函数抛出的异常,来达到重试的效果。由于使用reflection机制,异常是wrap在
InvocationTargetException中的,
不过在hive 0.9中竟然在捕捉到
此异常后直接throw出来了,而不是retry,明显不对啊。我对它修改了下,拿出wrap的target exception,判断是不是instance of jdoexception的,再做相应的处理

  @Override
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
Object ret = null; boolean gotNewConnectUrl = false;
boolean reloadConf = HiveConf.getBoolVar(hiveConf,
HiveConf.ConfVars.METASTOREFORCERELOADCONF);
boolean reloadConfOnJdoException = false; if (reloadConf) {
updateConnectionURL(getConf(), null);
} int retryCount = 0;
Exception caughtException = null;
while (true) {
try {
if (reloadConf || gotNewConnectUrl || reloadConfOnJdoException) {
initMS();
}
ret = method.invoke(base, args);
break;
} catch (javax.jdo.JDOException e) {
caughtException = (javax.jdo.JDOException) e.getCause();
} catch (UndeclaredThrowableException e) {
throw e.getCause();
} catch (InvocationTargetException e) {
Throwable t = e.getTargetException();
if (t instanceof JDOException){
caughtException = (JDOException) e.getTargetException();
reloadConfOnJdoException = true;
LOG.error("rawstore jdoexception:" + caughtException.toString());
}else {
throw e.getCause();
}
} if (retryCount >= retryLimit) {
throw caughtException;
} assert (retryInterval >= 0);
retryCount++;
LOG.error(
String.format(
"JDO datastore error. Retrying metastore command " +
"after %d ms (attempt %d of %d)", retryInterval, retryCount, retryLimit));
Thread.sleep(retryInterval);
// If we have a connection error, the JDO connection URL hook might
// provide us with a new URL to access the datastore.
String lastUrl = getConnectionURL(getConf());
gotNewConnectUrl = updateConnectionURL(getConf(), lastUrl);
}
return ret;
}

初始化RawStore有两种方式,一种是在
RetryingRawStore的构造函数中调用"
this.base = (RawStore) ReflectionUtils.newInstance(rawStoreClass, conf);
"  因为ObjectStore实现了Configurable,在newInstance方法中主动调用里面的setConf(conf)方法初始化RawStore,还有一种情况是在捕捉到异常后retry,也会调用
base.setConf(getConf());

private void initMS() {
base.setConf(getConf());
}

ObjectStore的setConf方法中,先将PersistenceManagerFactory锁住,pm close掉,设置成NULL,再初始化pm

public void setConf(Configuration conf) {
// Although an instance of ObjectStore is accessed by one thread, there may
// be many threads with ObjectStore instances. So the static variables
// pmf and prop need to be protected with locks.
pmfPropLock.lock();
try {
isInitialized = false;
hiveConf = conf;
Properties propsFromConf = getDataSourceProps(conf);
boolean propsChanged = !propsFromConf.equals(prop); if (propsChanged) {
pmf = null;
prop = null;
} assert(!isActiveTransaction());
shutdown();
// Always want to re-create pm as we don't know if it were created by the
// most recent instance of the pmf
pm = null;
openTrasactionCalls = 0;
currentTransaction = null;
transactionStatus = TXN_STATUS.NO_STATE; initialize(propsFromConf); if (!isInitialized) {
throw new RuntimeException(
"Unable to create persistence manager. Check dss.log for details");
} else {
LOG.info("Initialized ObjectStore");
}
} finally {
pmfPropLock.unlock();
}
}
private void initialize(Properties dsProps) {
LOG.info("ObjectStore, initialize called");
prop = dsProps;
pm = getPersistenceManager();
isInitialized = pm != null;
return;
}

回到一开始报错的那段信息,怎么会Persistence Manager会被关闭呢,仔细排查后才发现是由于HCatalog使用HiveMetastoreClient用完后主动调用了close方法,而一般Hive里面内部不会调这个方法.

HiveMetaStoreClient.java

public void close() {
isConnected = false;
try {
if (null != client) {
client.shutdown();
}
} catch (TException e) {
LOG.error("Unable to shutdown local metastore client", e);
}
// Transport would have got closed via client.shutdown(), so we dont need this, but
// just in case, we make this call.
if ((transport != null) && transport.isOpen()) {
transport.close();
}
}

对应server端HMSHandler中的shutdown方法

@Override
public void shutdown() {
logInfo("Shutting down the object store...");
RawStore ms = threadLocalMS.get();
if (ms != null) {
ms.shutdown();
ms = null;
}
logInfo("Metastore shutdown complete.");
}

ObjectStore的shutdown方法

public void shutdown() {
if (pm != null) {
pm.close();
}
}

我们看到shutdown方法里面只是把当前thread的ObjectStore拿出来后,做了一个ObjectStore shutdown方法,把pm关闭了。但是并没有把ObjectStore销毁掉,它还是存在于threadLocalMS中,下次还是会被拿出来,下一次这个thread服务于另外一个请求的时候又会被get出ObjectSture来,但是由于里面的pm已经close掉了所以肯定抛异常。正确的做法是应该加上threadLocalMS.remove()或者threadLocalMS.set(null),主动将其从ThreadLocalMap中删除。

修改后的
shutdown方法

public void shutdown() {
logInfo("Shutting down the object store...");
RawStore ms = threadLocalMS.get();
if (ms != null) {
ms.shutdown();
ms = null;
threadLocalMS.remove();
}
logInfo("Metastore shutdown complete.");
}

Hive Metastore ObjectStore PersistenceManager自动关闭bug解析的更多相关文章

  1. hive Metastore contains multiple versions

    凌晨接到hive作业异常,hive版本为1.2.1,hadoop版本apache 2.7.1,元数据存储在mysql中,异常信息如下: Logging initialized using config ...

  2. Hive metastore整体代码分析及详解

    从上一篇对Hive metastore表结构的简要分析中,我再根据数据设计的实体对象,再进行整个代码结构的总结.那么我们先打开metadata的目录,其目录结构: 可以看到,整个hivemeta的目录 ...

  3. Hive Metastore 代码简析

    1.  hive metastore 内部结构 1.1 包结构 从package结构来看,主要的5个package,让我们来看看这几个package的内容 (1) metastorepackage是m ...

  4. Hive metastore表结构设计分析

    今天总结下,Hive metastore的结构设计.什么是metadata呢,对于它的描述,可以理解为数据的数据,主要是描述数据的属性的信息.它是用来支持如存储位置.历史数据.资源查找.文件记录等功能 ...

  5. Hive metastore源码阅读(一)

    不要问我为什么,因为爱,哈哈哈哈...进入正题,最近做项目顺带学习了下hive metastore的源码,进行下知识总结. hive metastore的整体架构如图: 一.组成结构: 如图我们可以看 ...

  6. Hive安装配置指北(含Hive Metastore详解)

    个人主页: http://www.linbingdong.com 本文介绍Hive安装配置的整个过程,包括MySQL.Hive及Metastore的安装配置,并分析了Metastore三种配置方式的区 ...

  7. Hive metastore三种配置方式

    http://blog.csdn.net/reesun/article/details/8556078 Hive的meta数据支持以下三种存储方式,其中两种属于本地存储,一种为远端存储.远端存储比较适 ...

  8. Hadoop之Hive(2)--配置Hive Metastore

    Hive metastore服务以关系性数据库的方式存储Hive tables和partitions的metadata,并且提供给客户端访问这些数据的metastore service的API.下面介 ...

  9. hive metastore异常 org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client

    hiveserver2的端口是10000hive.metastoe.uris 的端口9083改为10000之后 beelien 连接hiveserver2报错 Error: Could not ope ...

随机推荐

  1. Netty源代码学习——ChannelPipeline模型分析

    參考Netty API io.netty.channel.ChannelPipeline A list of ChannelHandlers which handles or intercepts i ...

  2. EasyUI - 后台管理系统 - 增加,删除,修改

    效果: html代码: <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="ad ...

  3. 服务确定撤销/删除/关闭 (ml81n)

    FUNCTION zrfc_mm006. *"---------------------------------------------------------------------- * ...

  4. Java面试题精选(一)基础概念和面向对象

    --   基础概念和面向对象   --      全程将为大家剖析几大部分内容,由于学习经验有限,望大家谅解并接受宝贵的意见: 基础概念部分     ★★   : 常出现的高频率单词的区别理解(异常. ...

  5. python学习教程(九)sqlalchemy框架的modern映射

    首先写一个modern.py文件, from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import Col ...

  6. hive udaf 用maven打包运行create temporary function 时报错

    用maven打包写好的jar,在放到hive中作暂时函数时报错. 错误信息例如以下: hive> create temporary function maxvalue as "com. ...

  7. Oracle性能分析7:创建索引

    在创建索引时,我们往往希望可以预估索引大小,以评估对现有project环境的影响,我们也希望创建索引的过程可以最小化的影响我们正在执行的project环境,并能查看索引的状况. 预估索引大小 预估索引 ...

  8. html-图片button,抓包---Shinepans

    askLike.html <html> <meta http-equiv="content-type" content="text/html;chars ...

  9. PrimusUI

    小身材大用途,用PrimusUI驾驭你的页面 “PrimusUI”是自己在借鉴了如今网上很多开源的UI库,再经过自己整理加工的一个简单代码集合. 每个功能块的CSS代码都很少,力求简单易懂,低门槛,代 ...

  10. 自定义NavgationBa返回按钮

    iOS  上UINavigationController视图压栈形式,可以在当前视图无限制push许多视图,然而一些会觉得自带的push按钮不够美观,而且当上的上一个页面title很长的时候,那个返回 ...