LanguageManual UDF


UDF: User-Defined Function
UDAF: User-Defined Aggregation Function
e.g. max, min, count
UDTF: User-Defined Table-Generating Function
e.g. lateral view explode




2. First, create a new class that extends UDF and implements one or more methods named evaluate.

package com.cenzhongman.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class LowerUDF extends UDF {

    // - Implement one or more methods named evaluate, which Hive will call
    //   (how Hive resolves the method to call can be configured by setting
    //   a custom UDFMethodResolver). Some examples:
    //     public int evaluate();
    //     public int evaluate(int a);
    //     public double evaluate(int a, double b);
    //     public String evaluate(String a, int b, Text c);
    //     public Text evaluate(String a);
    //     public String evaluate(List<Integer> a);
    //       (Hive arrays are represented as Java Lists, so an ARRAY<int>
    //       column is passed in as a List<Integer>.)
    // - evaluate must never be a void method; it can return null if needed.
    // - Return types and method arguments can be either Java primitives or
    //   the corresponding Writable class.
    // Recommended: use Hadoop (MapReduce) Writable types such as Text for parameters.
    public Text evaluate(Text str) {
        // Null input: return null instead of throwing a NullPointerException
        if (str == null) {
            return null;
        }
        return new Text(str.toString().toLowerCase());
    }

    // For local testing only; Hive's entry point is evaluate, so main has no effect there
    public static void main(String[] args) {
        System.out.println(new LowerUDF().evaluate(new Text("Hive")));
    }
}
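The notes above say that Hive picks which evaluate overload to call, and that evaluate may return null but never void. Below is a rough sketch of that idea in plain Java, assuming a simple lookup by exact argument type. It is not Hive's actual resolver. EvaluateDispatchSketch, LowerSketch and call are hypothetical names made up for this illustration, and LowerSketch deliberately does not extend UDF so the sketch runs without Hive on the classpath.

```java
import java.lang.reflect.Method;

public class EvaluateDispatchSketch {

    // Hypothetical stand-in for a UDF class (assumption: plain Java types
    // instead of Hive's UDF base class and Hadoop's Text).
    public static class LowerSketch {
        // One-argument overload: null in, null out (never void).
        public String evaluate(String s) {
            return s == null ? null : s.toLowerCase();
        }

        // Two-argument overload: lower-case, then keep at most maxLen characters.
        public String evaluate(String s, Integer maxLen) {
            if (s == null || maxLen == null) {
                return null;
            }
            String lower = s.toLowerCase();
            int end = Math.max(0, Math.min(maxLen, lower.length()));
            return lower.substring(0, end);
        }
    }

    // Looks up the evaluate overload whose parameter types equal the runtime
    // classes of the arguments, then invokes it reflectively.
    // Assumption: every argument is non-null, so its class is known.
    public static Object call(Object udf, Object... args) throws Exception {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        Method m = udf.getClass().getMethod("evaluate", types);
        return m.invoke(udf, args);
    }

    public static void main(String[] args) throws Exception {
        LowerSketch udf = new LowerSketch();
        System.out.println(call(udf, "Hive"));      // prints hive
        System.out.println(call(udf, "HADOOP", 3)); // prints had
        System.out.println(udf.evaluate(null));     // prints null
    }
}
```

The point of the sketch is only that several evaluate methods can coexist in one class and that the argument types decide which one runs. In a real UDF the same null guard goes on the Text parameter itself, before calling toString().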

3. Use the custom function in Hive

-- Add the jar to Hive's resources
add jar /opt/datas/filename.jar;
-- Create a temporary function
create temporary function my_lower as "com.cenzhongman.hive.udf.LowerUDF";
-- List the functions to confirm it was added
show functions;
-- Use the function
select my_lower(job) Upper_job from emp;

As of Hive 0.13, the CREATE FUNCTION statement can also specify the jars a UDF requires:


CREATE FUNCTION myfunc AS 'myclass' USING JAR 'hdfs:///path/to/jar';

