[Hive] - Hive Parameters Explained

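Hive parameters fall into three categories. The first is system variables, which hold system-level environment information. The second is env variables, which hold the current user's environment variables. The third is Hive parameter variables, which are defined in hive-site.xml and by the current Hive session. This third category in turn consists of Hadoop HDFS parameters (taken directly from Hadoop), MapReduce parameters, metastore storage parameters, metastore connection parameters, and Hive runtime parameters.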
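
In the Hive CLI these categories are addressed with the prefixes system:, env: and hiveconf:, both in set commands and in ${...} references such as the ${hive.session.id} used later in the table. The Python sketch below imitates that lookup to make the namespaces concrete. It is not Hive's own code: the sample values are hypothetical, and treating a reference without a prefix as hiveconf is an assumption of this sketch.

```python
import re

# Hypothetical sample values. A real session would take these from the JVM,
# the shell environment, and hive-site.xml plus session-level `set` commands.
NAMESPACES = {
    "system": {"user.name": "alice"},
    "env": {"HOME": "/home/alice"},
    "hiveconf": {"hive.exec.parallel": "false"},
}

# Matches ${system:name}, ${env:name}, ${hiveconf:name} and bare ${name}.
_VAR = re.compile(r"\$\{(?:(system|env|hiveconf):)?([^}]+)\}")


def substitute(text, namespaces=NAMESPACES):
    """Replace ${namespace:name} references; unknown ones are left untouched."""
    def repl(match):
        # Assumption of this sketch: a bare ${name} is looked up in hiveconf.
        namespace = match.group(1) or "hiveconf"
        name = match.group(2)
        return namespaces.get(namespace, {}).get(name, match.group(0))
    return _VAR.sub(repl, text)


if __name__ == "__main__":
    print(substitute("user=${system:user.name} home=${env:HOME}"))
    print(substitute("parallel=${hiveconf:hive.exec.parallel}"))
```

Leaving unresolved references in place, rather than replacing them with an empty string, keeps a misspelled name visible in the output.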

Hive-0.13.1-cdh5.3.6 parameter reference
Parameter	Default	Meaning (purpose)
datanucleus.autoCreateSchema	true	Creates the necessary schema at startup if it does not exist. Set this to false after the schema has been created once.
datanucleus.autoStartMechanismMode	checked	Throws an exception if the metadata tables are incorrect. Possible values: checked, unchecked.
datanucleus.cache.level2	false	Whether to use a level 2 cache. Turn this off if metadata is changed independently of the Hive metastore server.
datanucleus.cache.level2.type	SOFT	Type of level 2 cache: SOFT = soft-reference-based cache, WEAK = weak-reference-based cache, none = no cache.
datanucleus.connectionPoolingType	BoneCP	Connection pool used for the metastore database.
datanucleus.fixedDatastore	false
datanucleus.identifierFactory	datanucleus1	Name of the identifier factory used when generating table and column names in the metastore database.
datanucleus.plugin.pluginRegistryBundleCheck	LOG	Defines what happens when plugin bundles are found and are duplicated [EXCEPTION|LOG|NONE].
datanucleus.rdbms.useLegacyNativeValueStrategy	true
datanucleus.storeManagerType	rdbms	Metadata store type.
datanucleus.transactionIsolation	read-committed	Default transaction isolation level for identity generation.
datanucleus.validateColumns	false	Validates the existing schema against the code. Turn this on to verify the columns of existing tables.
datanucleus.validateConstraints	false	Whether to validate constraints on existing tables.
datanucleus.validateTables	false	Whether to validate existing tables.
dfs.block.access.key.update.interval	600
hive.archive.enabled	false	Whether archiving operations are permitted.
hive.auto.convert.join	true	Whether Hive enables the optimization that converts a common join into a map join based on the input file size.
hive.auto.convert.join.noconditionaltask	true	Whether Hive enables the optimization that converts a common join into a map join based on the input file size. If this parameter is on and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than the specified size, the join is converted directly to a map join (there is no conditional task).
hive.auto.convert.join.noconditionaltask.size	10000000	If hive.auto.convert.join.noconditionaltask is off, this parameter does not take effect. If it is on and the sum of the sizes of n-1 of the tables/partitions in an n-way join is smaller than this size, the join is converted directly to a map join (there is no conditional task). The default is 10000000 bytes, about 10 MB.
hive.auto.convert.join.use.nonstaged	false	For conditional joins, if the input stream from a small alias can be applied directly to the join operator without filtering or projection, the alias does not need to be pre-staged in the distributed cache via a mapred local task. Currently this does not work with vectorization or the Tez execution engine.
hive.auto.convert.sortmerge.join	false	Whether a join is automatically converted to a sort-merge join when the joined tables meet the criteria for a sort-merge join.
hive.auto.convert.sortmerge.join.bigtable.selection.policy	org.apache.hadoop.hive.ql.optimizer.AvgPartitionSizeBasedBigTableSelectorForAutoSMJ
hive.auto.convert.sortmerge.join.to.mapjoin	false	Whether to convert a sort-merge join to a map join.
hive.auto.progress.timeout	0	How long to run the autoprogressor for script/UDTF operators, in seconds. Set to 0 for forever.
hive.autogen.columnalias.prefix.includefuncname	false	Whether the column aliases that Hive generates automatically include the function name. Off by default.
hive.autogen.columnalias.prefix.label	_c	Prefix used for the column aliases that Hive generates automatically.
hive.binary.record.max.length	1000	Maximum length of a binary record.
hive.cache.expr.evaluation	true	If true, the evaluation result of a deterministic expression that is referenced twice or more is cached. For example, in a filter condition like ".. where key + 10 > 10 or key + 10 = 0", "key + 10" is evaluated and cached once, then reused for the following expression ("key + 10 = 0"). Currently this applies only to expressions in select or filter operators.
hive.cli.errors.ignore	false
hive.cli.pretty.output.num.cols	-1
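hive.cli.print.current.db	false	Whether to show the name of the current database in the CLI prompt. Off by default.
hive.cli.print.header	false	Whether to print a header (the column names) in query output. Off by default.
hive.cli.prompt	hive	Prompt prefix of the Hive CLI. The client must be restarted after changing it.
hive.cluster.delegation.token.store.class	org.apache.hadoop.hive.thrift.MemoryTokenStore	Class that stores delegation tokens for the Hive cluster.
hive.cluster.delegation.token.store.zookeeper.znode	/hive/cluster/delegation	ZooKeeper znode used by the delegation token store.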
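hive.compactor.abortedtxn.threshold	1000	Number of aborted transactions involving a given table or partition that triggers a major compaction.
hive.compactor.check.interval	300	Interval, in seconds, between checks for tables or partitions that need compaction.
hive.compactor.delta.num.threshold	10	Number of delta directories in a table or partition that triggers a minor compaction.
hive.compactor.delta.pct.threshold	0.1	Size of the delta files relative to the base, as a fraction, that triggers a major compaction.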
hive.compactor.initiator.on	false
hive.compactor.worker.threads	0
hive.compactor.worker.timeout	86400	In seconds.
hive.compat	0.12	Compatibility version.
hive.compute.query.using.stats	false
hive.compute.splits.in.am	true
hive.conf.restricted.list	hive.security.authenticator.manager,hive.security.authorization.manager
hive.conf.validation	true
hive.convert.join.bucket.mapjoin.tez	false
hive.counters.group.name	HIVE
hive.debug.localtask	false
hive.decode.partition.name	false
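hive.default.fileformat	TextFile	Default file format. The default is TextFile.
hive.default.rcfile.serde	org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe	Default SerDe class for RCFile.
hive.default.serde	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe	Default SerDe class.
hive.display.partition.cols.separately	true	Whether partition columns are displayed separately from the other columns.
hive.downloaded.resources.dir	/tmp/${hive.session.id}_resources	Directory in which downloaded resources are stored.
hive.enforce.bucketing	false	Whether bucketing is enforced when inserting into a bucketed table.
hive.enforce.bucketmapjoin	false	Whether a query fails when a requested bucketed map join cannot be performed.
hive.enforce.sorting	false	Whether sorting is enforced when inserting into a sorted table.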
hive.enforce.sortmergebucketmapjoin	false
hive.entity.capture.transform	false
hive.entity.separator	@	Separator used to construct names of tables and partitions. For example, dbname@tablename@partitionname
hive.error.on.empty.partition	false	Whether to throw an exception if a dynamic partition insert generates empty results.
hive.exec.check.crossproducts	true	Whether to check the query plan for cross products (Cartesian products).
hive.exec.compress.intermediate	false	Whether intermediate results are compressed. The compression settings come from the Hadoop configuration mapred.output.compress*.
hive.exec.compress.output	false	Whether the final output is compressed.
hive.exec.concatenate.check.index	true
hive.exec.copyfile.maxsize	33554432
hive.exec.counters.pull.interval	1000
hive.exec.default.partition.name	__HIVE_DEFAULT_PARTITION__
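hive.exec.drop.ignorenonexistent	true	Whether to ignore the error raised when dropping an object that does not exist. Ignored by default; if set to false, an error is reported.
hive.exec.dynamic.partition	true	Whether dynamic partitions are allowed. When enabled, partition values need not be specified explicitly when writing data.
hive.exec.dynamic.partition.mode	strict	Dynamic partition mode. strict requires at least one static partition value; nonstrict allows all partitions to be dynamic.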
hive.exec.infer.bucket.sort	false
hive.exec.infer.bucket.sort.num.buckets.power.two	false
hive.exec.job.debug.capture.stacktraces	true
hive.exec.job.debug.timeout	30000
hive.exec.local.scratchdir	/tmp/hadoop
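hive.exec.max.created.files	100000	Maximum number of HDFS files that a MapReduce job may create.
hive.exec.max.dynamic.partitions	1000	Maximum total number of dynamic partitions that may be created.
hive.exec.max.dynamic.partitions.pernode	100	Maximum number of dynamic partitions that may be created on each mapper/reducer node.
hive.exec.mode.local.auto	false	Whether Hive may run in local mode.
hive.exec.mode.local.auto.input.files.max	4	Maximum number of input files for local mode.
hive.exec.mode.local.auto.inputbytes.max	134217728	Maximum input size, in bytes, for local mode (134217728 bytes = 128 MB).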
hive.exec.orc.default.block.padding	true
hive.exec.orc.default.buffer.size	262144
hive.exec.orc.default.compress	ZLIB
hive.exec.orc.default.row.index.stride	10000
hive.exec.orc.default.stripe.size	268435456
hive.exec.orc.dictionary.key.size.threshold	0.8
hive.exec.orc.memory.pool	0.5
hive.exec.orc.skip.corrupt.data	false
hive.exec.orc.zerocopy	false
hive.exec.parallel	false	Whether jobs may be executed in parallel. Off by default.
hive.exec.parallel.thread.number	8	Number of threads for parallel execution. The default is 8.
hive.exec.perf.logger	org.apache.hadoop.hive.ql.log.PerfLogger
hive.exec.rcfile.use.explicit.header	true
hive.exec.rcfile.use.sync.cache	true
hive.exec.reducers.bytes.per.reducer	1000000000	Amount of data handled per reducer. The default is 1 GB; for example, if the input size is 10 GB, 10 reducers are used.
hive.exec.reducers.max	999	Maximum number of reducers allowed. Takes effect when mapred.reduce.tasks is set to a negative value.
hive.exec.rowoffset	false
hive.exec.scratchdir	/etc/hive-hadoop
hive.exec.script.allow.partial.consumption	false
hive.exec.script.maxerrsize	100000
hive.exec.script.trust	false
hive.exec.show.job.failure.debug.info	true
hive.exec.stagingdir	.hive-staging
hive.exec.submitviachild	false
hive.exec.tasklog.debug.timeout	20000
hive.execution.engine	mr	Execution engine: mr or tez (Hadoop 2).
hive.exim.uri.scheme.whitelist	hdfs,pfile
hive.explain.dependency.append.tasktype	false
hive.fetch.output.serde	org.apache.hadoop.hive.serde2.DelimitedJSONSerDe
hive.fetch.task.aggr	false
hive.fetch.task.conversion	minimal
hive.fetch.task.conversion.threshold	-1
hive.file.max.footer	100
hive.fileformat.check	true
hive.groupby.mapaggr.checkinterval	100000
hive.groupby.orderby.position.alias	false
hive.groupby.skewindata	false
hive.hadoop.supports.splittable.combineinputformat	false
hive.hashtable.initialCapacity	100000
hive.hashtable.loadfactor	0.75
hive.hbase.generatehfiles	false
hive.hbase.snapshot.restoredir	/tmp
hive.hbase.wal.enabled	true
hive.heartbeat.interval	1000
hive.hmshandler.force.reload.conf	false
hive.hmshandler.retry.attempts	1
hive.hmshandler.retry.interval	1000
hive.hwi.listen.host	0.0.0.0
hive.hwi.listen.port	9999
hive.hwi.war.file	lib/hive-hwi-${version}.war
hive.ignore.mapjoin.hint	true
hive.in.test	false
hive.index.compact.binary.search	true
hive.index.compact.file.ignore.hdfs	false
hive.index.compact.query.max.entries	10000000
hive.index.compact.query.max.size	10737418240
hive.input.format	org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
hive.insert.into.external.tables	true
hive.insert.into.multilevel.dirs	false
hive.jobname.length	50
hive.join.cache.size	25000
hive.join.emit.interval	1000
hive.lazysimple.extended_boolean_literal	false
hive.limit.optimize.enable	false
hive.limit.optimize.fetch.max	50000
hive.limit.optimize.limit.file	10
hive.limit.pushdown.memory.usage	-1.0
hive.limit.query.max.table.partition	-1
hive.limit.row.max.size	100000
hive.localize.resource.num.wait.attempts	5
hive.localize.resource.wait.interval	5000
hive.lock.manager	org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
hive.mapred.partitioner	org.apache.hadoop.hive.ql.io.DefaultHivePartitioner
hive.mapred.reduce.tasks.speculative.execution	true
hive.mapred.supports.subdirectories	false
hive.metastore.uris	thrift://hh:9083
hive.metastore.warehouse.dir	/user/hive/warehouse
hive.multi.insert.move.tasks.share.dependencies	false
hive.multigroupby.singlereducer	true
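hive.zookeeper.clean.extra.nodes	false	Whether to clean up extra nodes at the end of the session.
hive.zookeeper.client.port	2181	ZooKeeper client port.
hive.zookeeper.quorum		IP addresses of the ZooKeeper servers.
hive.zookeeper.session.timeout	600000	Session timeout of the ZooKeeper client.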
hive.zookeeper.namespace	hive_zookeeper_namespace
javax.jdo.PersistenceManagerFactoryClass	org.datanucleus.api.jdo.JDOPersistenceManagerFactory
javax.jdo.option.ConnectionDriverName	Changed to: com.mysql.jdbc.Driver
javax.jdo.option.ConnectionPassword	Changed to: hive
javax.jdo.option.ConnectionURL	xxx
javax.jdo.option.ConnectionUserName	xxx
javax.jdo.option.DetachAllOnCommit	true
javax.jdo.option.Multithreaded	true
javax.jdo.option.NonTransactionalRead	true
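
Two rows in the table describe simple arithmetic rules: hive.exec.reducers.bytes.per.reducer together with hive.exec.reducers.max, and hive.auto.convert.join.noconditionaltask.size. The Python sketch below restates those rules as described in the table, with the table's defaults. The function names are hypothetical, the lower bound of one reducer is an assumption, and Hive's real planner takes more factors into account than this sketch does.

```python
import math


def estimate_reducers(input_bytes,
                      bytes_per_reducer=1_000_000_000,  # hive.exec.reducers.bytes.per.reducer
                      max_reducers=999):                # hive.exec.reducers.max
    """Estimate the reducer count used when mapred.reduce.tasks is negative."""
    wanted = math.ceil(input_bytes / bytes_per_reducer)
    # Assumption of this sketch: at least one reducer is always used.
    return max(1, min(max_reducers, wanted))


def converts_to_mapjoin(table_sizes,
                        threshold=10_000_000):  # hive.auto.convert.join.noconditionaltask.size
    """True if n-1 inputs of an n-way join together stay under the threshold.

    The largest input is treated as the big table, so the n-1 smallest
    inputs are the ones that have to fit.
    """
    if len(table_sizes) < 2:
        return False
    small_side = sum(sorted(table_sizes)[:-1])
    return small_side < threshold


if __name__ == "__main__":
    print(estimate_reducers(10_000_000_000))  # the 10 GB case from the table
    print(converts_to_mapjoin([50_000_000_000, 3_000_000, 4_000_000]))
```

Taking the n-1 smallest inputs is the most favourable reading of "n-1 of the tables": if they do not fit under the threshold, no other choice of n-1 inputs will.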
