[Hive - LanguageManual] LateralView
Lateral View Syntax
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)* |
fromClause: FROM baseTable (lateralView)* |
Description
Lateral view is used in conjunction with user-defined table generating functions such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of the base table and then joins the resulting output rows to the input rows to form a virtual table having the supplied table alias.
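The two steps in the description (apply the UDTF to each input row, then join its output rows back to that input row) can be modeled outside Hive. The following is a minimal Python sketch of the semantics only, not of Hive's implementation; `explode`, `lateral_view`, and the sample rows are hypothetical names introduced for illustration.

```python
def explode(values):
    """Model of a UDTF: yield one single-column output row per array element."""
    for value in values or []:
        yield (value,)


def lateral_view(rows, udtf, source_column, column_aliases):
    """Apply `udtf` to `source_column` of each row and join every generated
    row back to the input row under the given column aliases."""
    result = []
    for row in rows:
        for generated in udtf(row[source_column]):
            joined = dict(row)                             # columns of the input row
            joined.update(zip(column_aliases, generated))  # columns produced by the UDTF
            result.append(joined)
    return result


# Hypothetical base table: the second row has an empty array.
base_table = [
    {"id": "r1", "tags": ["x", "y"]},
    {"id": "r2", "tags": []},
]

virtual_table = lateral_view(base_table, explode, "tags", ["tag"])
for row in virtual_table:
    print(row["id"], row["tag"])
```

In this model the row `r2` contributes nothing to the virtual table, because the UDTF generates zero output rows for it.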
Version
Prior to Hive 0.6.0, lateral view did not support the predicate push-down optimization. In Hive 0.5.0 and earlier, a query that used a WHERE clause might not compile. A workaround was to add set hive.optimize.ppd=false; before the query. The fix was made in Hive 0.6.0; see https://issues.apache.org/jira/browse/HIVE-1056: Predicate push down does not work with UDTF's.
Version
From Hive 0.12.0, column aliases can be omitted. In this case, the aliases are inherited from the field names of the StructObjectInspector returned by the UDTF.
Example
Consider the following base table named pageAds. It has two columns: pageid (name of the page) and adid_list (an array of ads appearing on the page):
Column name | Column type |
---|---|
pageid | STRING |
adid_list | Array<int> |
An example table with two rows:
pageid | adid_list |
---|---|
front_page | [1, 2, 3] |
contact_page | [3, 4, 5] |
Suppose the user would like to count the total number of times an ad appears across all pages.
A lateral view with explode() can be used to convert adid_list into separate rows using the query:
SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid; |
The resulting output will be:
pageid (string) | adid (int) |
---|---|
"front_page" | 1 |
"front_page" | 2 |
"front_page" | 3 |
"contact_page" | 3 |
"contact_page" | 4 |
"contact_page" | 5 |
Then in order to count the number of times a particular ad appears, count/group by can be used:
SELECT adid, count(1) FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid GROUP BY adid; |
int adid | count(1) |
---|---|
1 | 1 |
2 | 1 |
3 | 2 |
4 | 1 |
5 | 1 |
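The explode-then-count steps above can be reproduced in plain Python as a sanity check on the expected results. This is only a sketch of what the two queries compute over the two-row pageAds example, not of how Hive executes them; the variable names are hypothetical.

```python
from collections import Counter

# The two-row pageAds example table from above.
page_ads = [
    {"pageid": "front_page", "adid_list": [1, 2, 3]},
    {"pageid": "contact_page", "adid_list": [3, 4, 5]},
]

# Models: LATERAL VIEW explode(adid_list) adTable AS adid
exploded = [
    (row["pageid"], adid)
    for row in page_ads
    for adid in row["adid_list"]
]

# Models: SELECT adid, count(1) ... GROUP BY adid
ad_counts = Counter(adid for _, adid in exploded)

for adid in sorted(ad_counts):
    print(adid, ad_counts[adid])
```

Ad 3 is the only one counted twice, because it is the only ad that appears on both pages.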
Multiple Lateral Views
A FROM clause can have multiple LATERAL VIEW clauses. Subsequent LATERAL VIEWS can reference columns from any of the tables appearing to the left of the LATERAL VIEW.
For example, the following could be a valid query:
SELECT * FROM exampleTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(myCol1) myTable2 AS myCol2; |
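To illustrate a later LATERAL VIEW referencing a column produced by an earlier one, here is a Python sketch of what the query above would compute. It assumes that col1 is an array of arrays; the source does not state the column types of exampleTable, so the data below is hypothetical.

```python
# Hypothetical contents of exampleTable: col1 is an array of arrays.
example_table = [
    {"col1": [[1, 2], [3]]},
    {"col1": [[4]]},
]

rows = []
for row in example_table:
    for my_col1 in row["col1"]:      # LATERAL VIEW explode(col1) myTable1 AS myCol1
        for my_col2 in my_col1:      # LATERAL VIEW explode(myCol1) myTable2 AS myCol2
            rows.append({"col1": row["col1"], "myCol1": my_col1, "myCol2": my_col2})

for r in rows:
    print(r["myCol1"], r["myCol2"])
```

The inner loop can only run once the outer loop has bound myCol1, which mirrors the rule that a lateral view may reference columns from tables to its left.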
LATERAL VIEW clauses are applied in the order that they appear. For example, with the following base table:
Array<int> col1 | Array<string> col2 |
---|---|
[1, 2] | ["a", "b", "c"] |
[3, 4] | ["d", "e", "f"] |
The query:
SELECT myCol1, col2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1; |
Will produce:
int myCol1 | Array<string> col2 |
---|---|
1 | ["a", "b", "c"] |
2 | ["a", "b", "c"] |
3 | ["d", "e", "f"] |
4 | ["d", "e", "f"] |
A query that adds an additional LATERAL VIEW:
SELECT myCol1, myCol2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(col2) myTable2 AS myCol2; |
Will produce:
int myCol1 | string myCol2 |
---|---|
1 | "a" |
1 | "b" |
1 | "c" |
2 | "a" |
2 | "b" |
2 | "c" |
3 | "d" |
3 | "e" |
3 | "f" |
4 | "d" |
4 | "e" |
4 | "f" |
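The two results above can be checked with a short Python sketch that models each LATERAL VIEW as one more nested loop over the same input row. It describes only what the queries compute for the two-row base table, not Hive's execution; the variable names are hypothetical.

```python
# The two-row base table from above.
base_table = [
    {"col1": [1, 2], "col2": ["a", "b", "c"]},
    {"col1": [3, 4], "col2": ["d", "e", "f"]},
]

# One lateral view: each input row is repeated once per element of col1.
one_view = [
    (my_col1, row["col2"])
    for row in base_table
    for my_col1 in row["col1"]
]

# Two lateral views: within each input row, every element of col1 is
# paired with every element of col2 (a per-row cross product).
two_views = [
    (my_col1, my_col2)
    for row in base_table
    for my_col1 in row["col1"]
    for my_col2 in row["col2"]
]

for my_col1, my_col2 in two_views:
    print(my_col1, my_col2)
```

Each input row contributes 2 x 3 = 6 output rows, giving the 12 rows listed; elements from different input rows are never combined.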
Outer Lateral Views
Version
Introduced in Hive version 0.12.0.
The user can specify the optional OUTER keyword to generate rows even when a LATERAL VIEW usually would not generate a row. This happens when the UDTF used does not generate any rows, which happens easily with explode when the column to explode is empty. In this case the source row would never appear in the results. OUTER can be used to prevent that: rows will be generated with NULL values in the columns coming from the UDTF.
For example, the following query returns an empty result:
SELECT * FROM src LATERAL VIEW explode(array()) C AS a limit 10; |
But with the OUTER keyword:
SELECT * FROM src LATERAL VIEW OUTER explode(array()) C AS a limit 10; |
it will produce:
238 val_238 NULL
86 val_86 NULL
311 val_311 NULL
27 val_27 NULL
165 val_165 NULL
409 val_409 NULL
255 val_255 NULL
278 val_278 NULL
98 val_98 NULL
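The difference between the two queries can be modeled in a few lines of Python. This is a sketch of the semantics under stated assumptions, not Hive's implementation: `lateral_view_explode` and its parameters are hypothetical, `None` stands in for NULL, and the two src rows are copied from the output above.

```python
def lateral_view_explode(rows, array_of, alias, outer=False):
    """Model LATERAL VIEW [OUTER] explode(...) over a list of dict rows.

    `array_of` computes the array to explode for a given row. With
    outer=True, a row whose array is empty (or None) is still emitted
    once, with None in the generated column.
    """
    result = []
    for row in rows:
        elements = array_of(row) or []
        if not elements and outer:
            elements = [None]          # keep the source row, NULL for the UDTF column
        for element in elements:
            result.append({**row, alias: element})
    return result


src = [
    {"key": 238, "value": "val_238"},
    {"key": 86, "value": "val_86"},
]

# explode(array()) always receives an empty array.
without_outer = lateral_view_explode(src, lambda row: [], "a")
with_outer = lateral_view_explode(src, lambda row: [], "a", outer=True)

print(len(without_outer))
for row in with_outer:
    print(row["key"], row["value"], row["a"])
```

Without OUTER the model returns no rows at all; with OUTER every source row survives with `None` in column a.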