[Hive - LanguageManual ] Windowing and Analytics Functions (待)
- Added by Lefty Leverenz, last edited by Lefty Leverenz on Aug 01, 2014 (view change)
- show comment
Windowing and Analytics Functions
- Windowing and Analytics Functions
- Enhancements to Hive QL
- Examples
- PARTITION BY with one partitioning column, no ORDER BY or window specification
- PARTITION BY with two partitioning columns, no ORDER BY or window specification
- PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
- PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
- PARTITION BY with partitioning, ORDER BY, and window specification
- WINDOW clause
- LEAD using default 1 row lead and not specifying default value
- LAG specifying a lag of 3 rows and default value of 0
Enhancements to Hive QL
Version
Icon
Introduced in Hive version 0.11.
This section introduces the Hive QL enhancements for windowing and analytics functions. See "Windowing Specifications in HQL" (attached to HIVE-4197) for details. HIVE-896 has more information, including links to earlier documentation in the initial comments.
All of the windowing and analytics functions operate as per the SQL standard.
The current release supports the following functions for windowing and analytics:
- Windowing functions
- LEAD
- The number of rows to lead can optionally be specified. If the number of rows to lead is not specified, the lead is one row.
- Returns null when the lead for the current row extends beyond the end of the window.
- LAG
- The number of rows to lag can optionally be specified. If the number of rows to lag is not specified, the lag is one row.
- Returns null when the lag for the current row extends before the beginning of the window.
- FIRST_VALUE
- LAST_VALUE
- LEAD
- The OVER clause
- OVER with standard aggregates:
- COUNT
- SUM
- MIN
- MAX
- AVG
- OVER with a PARTITION BY statement with one or more partitioning columns of any primitive datatype.
- OVER with PARTITION BY and ORDER BY with one or more partitioning and/or ordering columns of any datatype.
OVER with a window specification. Windows can be defined separately in a WINDOW clause. Window specifications support these standard options:
ROWS ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND (UNBOUNDED | [num]) FOLLOWING
IconThe OVER clause supports the following functions, but it does not support a window with them (see HIVE-4797):
Ranking functions: Rank, NTile, DenseRank, CumeDist, PercentRank.
Lead and Lag functions.
- OVER with standard aggregates:
- Analytics functions
- RANK
- ROW_NUMBER
- DENSE_RANK
- CUME_DIST
- PERCENT_RANK
- NTILE
Examples
This section provides examples of how to use the Hive QL windowing and analytics functions in SELECT statements. See HIVE-896 for additional examples.
PARTITION BY with one partitioning column, no ORDER BY or window specification
SELECT a, COUNT (b) OVER (PARTITION BY c) FROM T; |
PARTITION BY with two partitioning columns, no ORDER BY or window specification
SELECT a, COUNT (b) OVER (PARTITION BY c, d) FROM T; |
PARTITION BY with one partitioning column, one ORDER BY column, and no window specification
SELECT a, SUM (b) OVER (PARTITION BY c ORDER BY d) FROM T; |
PARTITION BY with two partitioning columns, two ORDER BY columns, and no window specification
SELECT a, SUM (b) OVER (PARTITION BY c, d ORDER BY e, f) FROM T; |
PARTITION BY with partitioning, ORDER BY, and window specification
SELECT a, SUM (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM T; |
SELECT a, AVG (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) FROM T; |
SELECT a, AVG (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) FROM T; |
SELECT a, AVG (b) OVER (PARTITION BY c ORDER BY d ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) FROM T; |
There can be multiple OVER
clauses in a single query. A single OVER
clause only applies to the immediately preceding function call. In this example, the first OVER clause applies to COUNT(b) and the second OVER clause applies to SUM(b):
SELECT a, COUNT (b) OVER (PARTITION BY c), SUM (b) OVER (PARTITION BY c) FROM T; |
Aliases can be used as well, with or without the keyword AS:
SELECT a, COUNT (b) OVER (PARTITION BY c) AS b_count, SUM (b) OVER (PARTITION BY c) b_sum FROM T; |
WINDOW clause
SELECT a, SUM (b) OVER w FROM T; WINDOW w AS (PARTITION BY c ORDER BY d ROWS UNBOUNDED PRECEDING) |
LEAD using default 1 row lead and not specifying default value
SELECT a, LEAD(a) OVER (PARTITION BY b ORDER BY C ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) FROM T; |
LAG specifying a lag of 3 rows and default value of 0
SELECT a, LAG(a, 3, 0) OVER (PARTITION BY b ORDER BY C ROWS 3 PRECEDING) FROM T; |
[Hive - LanguageManual ] Windowing and Analytics Functions (待)的更多相关文章
- [Hive - LanguageManual] Create/Drop/Alter -View、 Index 、 Function
Create/Drop/Alter View Create View Drop View Alter View Properties Alter View As Select Version info ...
- [HIve - LanguageManual] Hive Operators and User-Defined Functions (UDFs)
Hive Operators and User-Defined Functions (UDFs) Hive Operators and User-Defined Functions (UDFs) Bu ...
- [Hive - LanguageManual] Select base use
Select Syntax WHERE Clause ALL and DISTINCT Clauses Partition Based Queries HAVING Clause LIMIT Clau ...
- [Hive - LanguageManual ] ]SQL Standard Based Hive Authorization
Status of Hive Authorization before Hive 0.13 SQL Standards Based Hive Authorization (New in Hive 0. ...
- [HIve - LanguageManual] LateralView
Lateral View Syntax Description Example Multiple Lateral Views Outer Lateral Views Lateral View Synt ...
- [HIve - LanguageManual] XPathUDF
Documentation for Built-In User-Defined Functions Related To XPath UDFs xpath, xpath_short, xpath_in ...
- [Hive - LanguageManual] GroupBy
Group By Syntax Simple Examples Select statement and group by clause Advanced Features Multi-Group-B ...
- [Hive - LanguageManual] Import/Export
LanguageManual ImportExport Skip to end of metadata Added by Carl Steinbach, last edited by Le ...
- [Hive - LanguageManual] DML: Load, Insert, Update, Delete
LanguageManual DML Hive Data Manipulation Language Hive Data Manipulation Language Loading files int ...
随机推荐
- JavaScript DOM编程基础精华02(window对象的属性,事件中的this,动态创建DOM,innerText和innerHTML)
window对象的属性1 window.location对象: window.location.href=‘’;//重新导航到新页面,可以取值,也可以赋值. window.location.reloa ...
- IDEA15 File工具栏中没有 Import Project
使用IDEA准备导入项目时发现没有Import Project选项... 解决办法: Settings > Appearance & Bechavior > Menus and T ...
- IE6-IE11兼容性问题列表及解决办法
IE6-IE11兼容性问题列表及解决办法总结 相比IE6-IE9那版,主要添加IE10和IE11的新变化. 以下是目录及下载链接: 目录概述 2第一章:HTML 3第一节:IE7-IE8更新 3 1. ...
- distinct用法
distinct可以列出不重复的记录,对于单个字段来说distinct使用比较简单,但是对于多个字段来说,distinct使用起来会使人发狂.而且貌似也没有见到微软对distinct使用多字段的任何说 ...
- JAVA的核心概念:接口(interface)
JAVA的核心概念:接口(interface) 接口与类属于同一层次,实际上,接口是一种特殊的抽象类. 如: interface IA{ } public interface: 公开接口 与 ...
- 1033. Labyrinth(dfs)
1033 简单dfs 有一点小小的坑 就是图可能不连通 所以要从左上和右下都搜一下 加起来 从讨论里看到的 讨论里看到一句好无奈的回复 “可不可以用中文呀...” #include <iostr ...
- 真心崩溃了,oracle安装完成后居然没有tnsnames.ora和listener.ora文件
problem: oracle 11 r2 64位安装完成后NETWORK/ADMIN目录下居然没有tnsnames.ora和listener.ora文件 solution: 问题是之前安装了另 ...
- Instruments-查看收集到的数据
由于Xcode调试工具Instruments指南篇幅太长,所以本篇blog继续上篇,介绍对Instruments收集到的数据去分析. 关于数据分析 Instruments不解决你代码中的任何问题,它帮 ...
- C#中跨线程读取控件值、设置控件值
编写应用程序时,涉及到大量数据处理.串口通信.Socket通信等都会用到多线程,多线程中如何跨线程调用主界面或其他界面下的控件是一个问题,利用invoke和delegate可以解决. delegate ...
- UVa 11529 (计数) Strange Tax Calculation
枚举一个中心点,然后将其他点绕着这个点按照极角排序. 统计这个中心点在外面的三角形的个数,然后用C(n-1, 3)减去这个数就是包含这个点的三角形的数量. 然后再枚举一个起点L,终点为弧度小于π的点R ...