[Hive - LanguageManual] LateralView
Lateral View Syntax
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)* |
fromClause: FROM baseTable (lateralView)* |
Description
Lateral view is used in conjunction with user-defined table generating functions such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of the base table and then joins the resulting output rows to the input rows to form a virtual table having the supplied table alias.
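The apply-then-join behaviour described above can be modelled outside Hive. The following Python sketch only illustrates the semantics and is not how Hive executes a query. The helper names `explode` and `lateral_view`, the dict-per-row representation, and the sample rows are all assumptions made for this example.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Dict[str, Any]


def explode(values: Iterable[Any]) -> Iterator[Tuple[Any, ...]]:
    """Toy stand-in for explode(): emit one single-column row per array element."""
    for value in values:
        yield (value,)


def lateral_view(
    base_rows: Iterable[Row],
    udtf: Callable[[Any], Iterable[Tuple[Any, ...]]],
    input_column: str,
    column_aliases: Sequence[str],
) -> List[Row]:
    """Apply the UDTF to each base row and join its output rows back to that row."""
    virtual_table = []
    for row in base_rows:
        for generated in udtf(row[input_column]):
            # Each generated row is paired with the input row that produced it.
            virtual_table.append({**row, **dict(zip(column_aliases, generated))})
    return virtual_table


# Hypothetical base table: an id plus an array column.
base_table = [
    {"id": "r1", "items": [10, 20]},
    {"id": "r2", "items": []},  # the UDTF yields nothing, so r2 drops out
    {"id": "r3", "items": [30]},
]

joined = lateral_view(base_table, explode, "items", ["item"])
for row in joined:
    print(row["id"], row["item"])
```

Because the UDTF may generate zero rows, an input row such as `r2` contributes nothing to the virtual table in this model.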
Version
Prior to Hive 0.6.0, lateral view did not support the predicate push-down optimization. In Hive 0.5.0 and earlier, a query that used a WHERE clause might not compile. A workaround was to add set hive.optimize.ppd=false; before the query. The fix was made in Hive 0.6.0; see https://issues.apache.org/jira/browse/HIVE-1056: Predicate push down does not work with UDTF's.
Version
From Hive 0.12.0, column aliases can be omitted. In this case, aliases are inherited from the field names of the StructObjectInspector returned by the UDTF.
Example
Consider the following base table named pageAds. It has two columns: pageid (name of the page) and adid_list (an array of ads appearing on the page):
Column name | Column type |
---|---|
pageid | STRING |
adid_list | Array<int> |
An example table with two rows:
pageid | adid_list |
---|---|
front_page | [1, 2, 3] |
contact_page | [3, 4, 5] |
Suppose the user would like to count the total number of times an ad appears across all pages.
A lateral view with explode() can be used to convert adid_list into separate rows using the query:
SELECT pageid, adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid; |
The resulting output will be
pageid (string) | adid (int) |
---|---|
"front_page" | 1 |
"front_page" | 2 |
"front_page" | 3 |
"contact_page" | 3 |
"contact_page" | 4 |
"contact_page" | 5 |
Then in order to count the number of times a particular ad appears, count/group by can be used:
SELECT adid, count(1) FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid GROUP BY adid; |
int adid | count(1) |
---|---|
1 | 1 |
2 | 1 |
3 | 2 |
4 | 1 |
5 | 1 |
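As a cross-check of the counts above, the explode-then-group step can be reproduced in a few lines of Python. This is a sketch of the semantics only, using the two example rows of pageAds; the variable names `exploded` and `ad_counts` are introduced here for illustration.

```python
from collections import Counter

# The two example rows of pageAds.
page_ads = [
    {"pageid": "front_page", "adid_list": [1, 2, 3]},
    {"pageid": "contact_page", "adid_list": [3, 4, 5]},
]

# LATERAL VIEW explode(adid_list) adTable AS adid:
# one (pageid, adid) row per element of adid_list.
exploded = [(row["pageid"], adid) for row in page_ads for adid in row["adid_list"]]

# GROUP BY adid with count(1): count how many exploded rows carry each adid.
ad_counts = Counter(adid for _pageid, adid in exploded)

for adid in sorted(ad_counts):
    print(adid, ad_counts[adid])
```

Ad 3 is the only one counted twice because it is the only ad that appears on both pages.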
Multiple Lateral Views
A FROM clause can have multiple LATERAL VIEW clauses. Subsequent LATERAL VIEWS can reference columns from any of the tables appearing to the left of the LATERAL VIEW.
For example, the following could be a valid query:
SELECT * FROM exampleTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(myCol1) myTable2 AS myCol2; |
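For the second explode in such a query to make sense, myCol1 must itself be an array, so col1 would need to be an array of arrays. The source does not state the column type, so that is an assumption here. Under it, a Python sketch of the semantics (with made-up sample data) looks like this:

```python
# Hypothetical exampleTable in which col1 is an array of arrays (assumed type).
example_table = [
    {"col1": [[1, 2], [3]]},
    {"col1": [[4]]},
]

rows = []
for row in example_table:
    # LATERAL VIEW explode(col1) myTable1 AS myCol1
    for my_col1 in row["col1"]:
        # LATERAL VIEW explode(myCol1) myTable2 AS myCol2
        # The second view reads a column produced by the view to its left.
        for my_col2 in my_col1:
            rows.append({"col1": row["col1"], "myCol1": my_col1, "myCol2": my_col2})

for r in rows:
    print(r["myCol1"], r["myCol2"])
```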
LATERAL VIEW clauses are applied in the order that they appear. For example, with the following base table:
Array<int> col1 | Array<string> col2 |
---|---|
[1, 2] | ["a", "b", "c"] |
[3, 4] | ["d", "e", "f"] |
The query:
SELECT myCol1, col2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1; |
Will produce:
int myCol1 | Array<string> col2 |
---|---|
1 | ["a", "b", "c"] |
2 | ["a", "b", "c"] |
3 | ["d", "e", "f"] |
4 | ["d", "e", "f"] |
A query that adds an additional LATERAL VIEW:
SELECT myCol1, myCol2 FROM baseTable LATERAL VIEW explode(col1) myTable1 AS myCol1 LATERAL VIEW explode(col2) myTable2 AS myCol2; |
Will produce:
int myCol1 | string myCol2 |
---|---|
1 | "a" |
1 | "b" |
1 | "c" |
2 | "a" |
2 | "b" |
2 | "c" |
3 | "d" |
3 | "e" |
3 | "f" |
4 | "d" |
4 | "e" |
4 | "f" |
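The row order in this result follows from applying the lateral views left to right, which behaves like nested loops per base row. The Python sketch below models that on the same base table. It illustrates the semantics only, and the function name `two_lateral_views` is made up for this example.

```python
base_table = [
    {"col1": [1, 2], "col2": ["a", "b", "c"]},
    {"col1": [3, 4], "col2": ["d", "e", "f"]},
]


def two_lateral_views(rows):
    """Model two chained LATERAL VIEW explode() clauses as nested loops."""
    result = []
    for row in rows:
        # First clause: LATERAL VIEW explode(col1) myTable1 AS myCol1
        for my_col1 in row["col1"]:
            # Second clause: LATERAL VIEW explode(col2) myTable2 AS myCol2
            # It runs once for every row produced by the first clause.
            for my_col2 in row["col2"]:
                result.append((my_col1, my_col2))
    return result


pairs = two_lateral_views(base_table)
for my_col1, my_col2 in pairs:
    print(my_col1, my_col2)
```

Each base row contributes the product of its two array lengths, here 2 × 3 = 6 rows per base row, so 12 rows in total.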
Outer Lateral Views
Version
Introduced in Hive version 0.12.0.
The user can specify the optional OUTER keyword to generate rows even when a LATERAL VIEW usually would not generate a row. This happens when the UDTF does not generate any rows, which happens easily with explode when the column to explode is empty. In this case the source row would never appear in the results. OUTER can be used to prevent that: rows will be generated with NULL values in the columns coming from the UDTF.
For example, the following query returns an empty result:
SELECT * FROM src LATERAL VIEW explode(array()) C AS a limit 10; |
But with the OUTER keyword
SELECT * FROM src LATERAL VIEW OUTER explode(array()) C AS a limit 10; |
it will produce:
238 val_238 NULL
86 val_86 NULL
311 val_311 NULL
27 val_27 NULL
165 val_165 NULL
409 val_409 NULL
255 val_255 NULL
278 val_278 NULL
98 val_98 NULL
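The difference between the plain and OUTER forms can be sketched in Python as follows. This is a model of the semantics rather than Hive's implementation. The function `lateral_view_explode`, its `outer` flag, and the sample rows are assumptions made for illustration, with Python's `None` standing in for NULL.

```python
def lateral_view_explode(rows, array_column, alias, outer=False):
    """Model LATERAL VIEW [OUTER] explode(array_column) ... AS alias."""
    result = []
    for row in rows:
        elements = row[array_column]
        if not elements and outer:
            # OUTER: keep the source row, with NULL (None) in the UDTF column.
            result.append({**row, alias: None})
        for element in elements:
            result.append({**row, alias: element})
    return result


# Hypothetical source rows; the second one has an empty array.
src = [
    {"key": 1, "tags": ["x", "y"]},
    {"key": 2, "tags": []},
]

inner_rows = lateral_view_explode(src, "tags", "tag")
outer_rows = lateral_view_explode(src, "tags", "tag", outer=True)

print([(r["key"], r["tag"]) for r in inner_rows])
print([(r["key"], r["tag"]) for r in outer_rows])
```

In this model the row with key 2 is missing from `inner_rows` but present in `outer_rows` with `None` as its tag.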