Hive window analysis functions

0: jdbc:hive2://localhost:10000> select * from t_access;
+----------------+---------------------------------+-----------------------+--------------+--+
| t_access.ip | t_access.url | t_access.access_time | t_access.dt |
+----------------+---------------------------------+-----------------------+--------------+--+
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 20170804 |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 20170804 |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 20170804 |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 20170804 |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 20170804 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 20170805 |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 20170805 |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 20170805 |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 20170805 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 20170805 |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 20170806 |
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 20170806 |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 20170806 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 |
+----------------+---------------------------------+-----------------------+--------------+--+

## LAG function
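`LAG(col, n, default)` returns the value of `col` from the row `n` rows before the current one in the same partition; `n` defaults to 1 and `default` to NULL when omitted. A common use is measuring the gap between consecutive visits. The following is a minimal sketch, assuming `access_time` is stored as a `yyyy-MM-dd HH:mm:ss` string (or a timestamp) so that `unix_timestamp` can read it; the aliases `prev_access_time` and `gap_seconds` and the subquery name `t` are illustrative and not part of the original table.

```sql
select ip, url, access_time,
       prev_access_time,
       -- NULL for each ip's first visit, because there is no previous row
       unix_timestamp(access_time) - unix_timestamp(prev_access_time) as gap_seconds
from (
    select ip, url, access_time,
           -- no default argument, so the first row per ip gets NULL instead of 0
           lag(access_time, 1) over(partition by ip order by access_time) as prev_access_time
    from t_access
) t;
```

Leaving out the third argument yields NULL rather than 0 for the first row of each partition, which keeps the column's type consistent with `access_time`.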
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
lag(access_time,1,0) over(partition by ip order by access_time) as last_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| ip | url | access_time | rn | last_access_time |
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | 0 |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | 0 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 0 |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | 2017-08-04 15:30:20 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | 2017-08-04 15:35:20 |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | 0 |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 0 |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | 2017-08-04 15:30:20 |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | 0 |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | 0 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-05 16:30:20 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | 2017-08-06 16:30:20 |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | 0 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | 0 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | 2017-08-05 15:40:20 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | 2017-08-06 15:40:20 |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | 0 |
+----------------+---------------------------------+----------------------+-----+----------------------+--+

## LEAD function
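`LEAD(col, n, default)` is the mirror image of LAG: it looks `n` rows ahead within the partition instead of behind. As a hedged sketch on the `t_access` table shown at the top, each visit can be paired with the page the same `ip` visited next; the alias `next_url` is an illustrative name.

```sql
select ip, url, access_time,
       -- NULL on each ip's last visit, since no later row exists
       lead(url, 1) over(partition by ip order by access_time) as next_url
from t_access;
```

Rows with identical `access_time` values (such as the two 192.168.33.46 visits at 2017-08-06 16:30:20) have no guaranteed relative order, so adding a tie-breaking column to `order by` may be needed for repeatable results.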
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
lead(access_time,1,0) over(partition by ip order by access_time) as next_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| ip | url | access_time | rn | next_access_time |
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | 0 |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | 0 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 15:35:20 |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | 2017-08-05 15:30:20 |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | 0 |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | 0 |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 16:30:20 |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | 0 |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | 0 |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | 2017-08-06 16:30:20 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-06 16:30:20 |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | 0 |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | 0 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | 2017-08-06 15:40:20 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | 2017-08-06 15:40:20 |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | 0 |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | 0 |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | 0 |
+----------------+---------------------------------+----------------------+-----+----------------------+--+

## FIRST_VALUE function
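`FIRST_VALUE(col)` returns `col` from the first row of the window frame and repeats it on every row of the partition. When only one row per `ip` is wanted, filtering on `row_number()` is a common alternative. A minimal sketch, where the alias `first_url` and the subquery name `t` are illustrative:

```sql
select ip, url as first_url, access_time
from (
    select ip, url, access_time,
           row_number() over(partition by ip order by access_time) as rn
    from t_access
) t
-- keep only each ip's earliest visit
where rn = 1;
```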
Example: get the first page each user visited
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
first_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following) as first_url
from t_access;
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| ip | url | access_time | rn | first_url |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | http://www.xxx.ccc.aa/stu |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | http://www.xxx.ccc.aa/stu |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | http://www.xxx.ccc.aa/job |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | http://www.xxx.ccc.aa/job |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/pay |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | http://www.xxx.ccc.aa/teach |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+

## LAST_VALUE function
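`LAST_VALUE` is sensitive to the window frame. When `order by` is given without an explicit frame, Hive's documented default is `range between unbounded preceding and current row`, so the frame ends at the current row (and its peers) rather than at the end of the partition. A sketch contrasting the two behaviors, with illustrative aliases `last_in_default_frame` and `last_in_partition`:

```sql
select ip, url, access_time,
       -- default frame: ends at the current row, so the result follows the current row
       last_value(url) over(partition by ip order by access_time) as last_in_default_frame,
       -- explicit full-partition frame: the last page per ip
       last_value(url) over(partition by ip order by access_time
                            rows between unbounded preceding and unbounded following) as last_in_partition
from t_access;
```

This is why the frame is usually spelled out as `rows between unbounded preceding and unbounded following` when the last value of the whole partition is wanted.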
Example: get the last page each user visited
select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
last_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following) as last_url
from t_access;
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| ip | url | access_time | rn | last_url |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/register |
| 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | http://www.xxx.ccc.aa/stu |
| 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | http://www.xxx.ccc.aa/stu |
| 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | http://www.xxx.ccc.aa/stu |
| 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | http://www.xxx.ccc.aa/excersize |
| 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | http://www.xxx.ccc.aa/job |
| 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | http://www.xxx.ccc.aa/job |
| 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/pay |
| 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | http://www.xxx.ccc.aa/teach |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+

/*
Cumulative report, implemented with analytic functions
*/
-- sum() over() function
select id
,month
,sum(amount) over(partition by id order by month rows between unbounded preceding and current row) as cumulative_amount
from
(select id,month,
sum(fee) as amount
from t_test
group by id,month) tmp;
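
The frame clause decides which rows each `sum()` sees. As a hedged sketch on the same `t_test` table (columns `id`, `month`, and `fee` as used in the query above; the aliases `running_total`, `last_3_rows`, and `id_total` are illustrative names, not from the original), three frames can be compared side by side:

```sql
select id,
       month,
       amount,
       -- running total from the first month up to the current month
       sum(amount) over(partition by id order by month
                        rows between unbounded preceding and current row) as running_total,
       -- moving sum over the current row and the two rows before it
       sum(amount) over(partition by id order by month
                        rows between 2 preceding and current row) as last_3_rows,
       -- no order by and no frame: total over the whole partition
       sum(amount) over(partition by id) as id_total
from (
    select id, month, sum(fee) as amount
    from t_test
    group by id, month
) tmp;
```

Note that `rows between 2 preceding and current row` counts rows, not calendar months, so a missing month in `t_test` widens the time span that `last_3_rows` covers.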