Advanced Oracle: Analytic Functions

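This blog is a record of what I have picked up while studying and working. It is based purely on my own experience and is mainly for my own reference. Feel free to repost it, but please cite the source.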

　　http://www.cnblogs.com/king-xg/p/6797119.html

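Analytic functions can reference values across rows and aggregate at multiple levels, and they let you control the sort granularity within a subset of the data. Unlike group (aggregate) functions, analytic functions do not collapse the result set into fewer rows.

In many cases they are also considerably faster than the equivalent traditional SQL written with self-joins or correlated subqueries, since the table typically needs to be read only once.

With analytic functions you can get aggregated and non-aggregated values in the same row without using any self-join.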
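The point about getting detail and aggregate values side by side can be sketched as follows. This is only an illustration, not Oracle itself: it assumes Python's built-in sqlite3 module (SQLite 3.25 or later, which supports window functions) as a stand-in, and loads just four rows of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 1, 2, 630), (2000, 2, 1, 1500), (2000, 2, 2, 1630)],
)

# Each detail row keeps its own countSum and also carries the month total,
# with no self-join and no group by.
rows = conn.execute(
    """
    select month, pid, countSum,
           sum(countSum) over (partition by year, month) as month_total
    from product_cost
    order by month, pid
    """
).fetchall()

for row in rows:
    print(row)
```

All four input rows survive in the output; the total is simply repeated on every row of its month.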

Sample data:

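-- Create the sales table
-- (id is declared as an identity column, Oracle 12c+, because the inserts below do not supply an id)

create table product_cost(
       id number(18) generated always as identity primary key,
       year number(4),
       month number(2),
       pid number(18),
       countSum number(18)
);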

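-- Create the product table
-- (id is declared as an identity column, Oracle 12c+, because the inserts below do not supply an id)

create table product(
       id number(18) generated always as identity primary key,
       pname varchar2(100),
       price number(8,2)
);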

-- Populate the product table (the price values are empty here; Oracle stores '' as NULL)

insert into product (pname,price) values('i7-6700K','');

insert into product (pname,price) values('i7-6600K','');

insert into product (pname,price) values('i7-6500K','');

insert into product (pname,price) values('i7-6400K','');

insert into product (pname,price) values('i7-6300K','');

insert into product (pname,price) values('i7-6200K','');

insert into product (pname,price) values('i7-6100K','');

-- Populate the sales table

insert into product_cost(year,month,pid,countSum) values(2000,1,1,500);

insert into product_cost(year,month,pid,countSum) values(2000,1,2,630);

insert into product_cost(year,month,pid,countSum) values(2000,1,3,1200);

insert into product_cost(year,month,pid,countSum) values(2000,1,4,320);

insert into product_cost(year,month,pid,countSum) values(2000,1,5,250);

insert into product_cost(year,month,pid,countSum) values(2000,1,6,250);

insert into product_cost(year,month,pid,countSum) values(2000,1,7,350);

insert into product_cost(year,month,pid,countSum) values(2000,2,1,1500);

insert into product_cost(year,month,pid,countSum) values(2000,2,2,1630);

insert into product_cost(year,month,pid,countSum) values(2000,2,3,200);

insert into product_cost(year,month,pid,countSum) values(2000,2,4,1320);

insert into product_cost(year,month,pid,countSum) values(2000,2,5,250);

insert into product_cost(year,month,pid,countSum) values(2000,2,6,350);

insert into product_cost(year,month,pid,countSum) values(2000,2,7,520);

insert into product_cost(year,month,pid,countSum) values(2000,3,1,520);

insert into product_cost(year,month,pid,countSum) values(2000,3,2,660);

insert into product_cost(year,month,pid,countSum) values(2000,3,3,1900);

insert into product_cost(year,month,pid,countSum) values(2000,3,4,300);

insert into product_cost(year,month,pid,countSum) values(2000,3,5,210);

insert into product_cost(year,month,pid,countSum) values(2000,3,6,210);

insert into product_cost(year,month,pid,countSum) values(2000,3,7,320);

insert into product_cost(year,month,pid,countSum) values(2000,4,1,1520);

insert into product_cost(year,month,pid,countSum) values(2000,4,2,1660);

insert into product_cost(year,month,pid,countSum) values(2000,4,3,2900);

insert into product_cost(year,month,pid,countSum) values(2000,4,4,1200);

insert into product_cost(year,month,pid,countSum) values(2000,4,5,980);

insert into product_cost(year,month,pid,countSum) values(2000,4,6,910);

insert into product_cost(year,month,pid,countSum) values(2000,4,7,620);

insert into product_cost(year,month,pid,countSum) values(2001,1,1,500);

insert into product_cost(year,month,pid,countSum) values(2001,1,2,630);

insert into product_cost(year,month,pid,countSum) values(2001,1,3,1200);

insert into product_cost(year,month,pid,countSum) values(2001,1,4,320);

insert into product_cost(year,month,pid,countSum) values(2001,1,5,150);

insert into product_cost(year,month,pid,countSum) values(2001,1,6,250);

insert into product_cost(year,month,pid,countSum) values(2001,1,7,350);

insert into product_cost(year,month,pid,countSum) values(2001,2,1,1500);

insert into product_cost(year,month,pid,countSum) values(2001,2,2,1630);

insert into product_cost(year,month,pid,countSum) values(2001,2,3,200);

insert into product_cost(year,month,pid,countSum) values(2001,2,4,1320);

insert into product_cost(year,month,pid,countSum) values(2001,2,5,250);

insert into product_cost(year,month,pid,countSum) values(2001,2,6,350);

insert into product_cost(year,month,pid,countSum) values(2001,2,7,450);

insert into product_cost(year,month,pid,countSum) values(2001,3,1,520);

insert into product_cost(year,month,pid,countSum) values(2001,3,2,660);

insert into product_cost(year,month,pid,countSum) values(2001,3,3,1900);

insert into product_cost(year,month,pid,countSum) values(2001,3,4,300);

insert into product_cost(year,month,pid,countSum) values(2001,3,5,180);

insert into product_cost(year,month,pid,countSum) values(2001,3,6,210);

insert into product_cost(year,month,pid,countSum) values(2001,3,7,320);

insert into product_cost(year,month,pid,countSum) values(2001,4,1,1520);

insert into product_cost(year,month,pid,countSum) values(2001,4,2,1660);

insert into product_cost(year,month,pid,countSum) values(2001,4,3,2900);

insert into product_cost(year,month,pid,countSum) values(2001,4,4,1200);

insert into product_cost(year,month,pid,countSum) values(2001,4,5,980);

insert into product_cost(year,month,pid,countSum) values(2001,4,6,910);

insert into product_cost(year,month,pid,countSum) values(2001,4,7,620);

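-- Common analytic functions

1. lag             -- accesses a preceding row in a partition or result set

2. lead            -- accesses a following row in a partition or result set

3. first_value     -- accesses the first row of a partition or result set

4. last_value      -- accesses the last row of a partition or result set

5. nth_value       -- accesses the Nth row of a partition or result set

6. rank            -- ranks rows in sort order; tied rows share a rank and the following rank values are skipped

7. dense_rank      -- ranks rows in sort order; tied rows share a rank and no rank values are skipped

8. row_number      -- numbers the rows sequentially in sort order, so every row gets a unique number (the order among tied rows is arbitrary)

9. ratio_to_report -- computes the ratio of a value to the sum of the values in the partition or result set

10. percent_rank   -- normalizes the computed rank to a value between 0 and 1

11. ntile          -- splits each partition into the given number of buckets and numbers them (the number is unique only within the partition); bucket sizes differ by at most one row

12. listagg        -- concatenates column values from different rows into a single list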

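Parts of an analytic function:
    (1) Column or expression: an analytic function usually has two pairs of parentheses; the first one names the column (or expression) the function operates on.
    (2) Partitioning: 1. The keyword partition by splits the data into partitions. It looks a bit like group by but is quite different: after group by, each group value appears only once (as if the data were classified and only the class names were shown), whereas partition by leaves the rows untouched and only draws boundaries between them according to the listed columns (every partition has a boundary).
                      2. group by collapses the whole table or data set into one row per group; partition by is evaluated for every row, and each row keeps its own values.
    (3) Ordering: order by sorts the rows inside each partition.
    (4) Window control: 1. Sets the range of rows the function can see. When order by is present, the default is range between unbounded preceding and current row (from the start of the partition to the current row and its peers); without order by, the default is the whole partition. You can also specify rows between unbounded preceding and current row, rows between unbounded preceding and unbounded following (the whole partition), or a custom range such as rows between [number] preceding and [number] following (from that many rows before the current row to that many rows after it).
                        2. When there is no partition by, the window clause can to some extent play the role of partitioning.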
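The window options listed above can be compared in one query. This is a minimal sketch, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle; the window syntax used here is the standard form that Oracle also accepts. It uses the four monthly quantities of product 1 in 2000 from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 2, 1, 1500), (2000, 3, 1, 520), (2000, 4, 1, 1520)],
)

rows = conn.execute(
    """
    select month, countSum,
           -- start of the partition up to the current row: a running total
           sum(countSum) over (order by month
               rows between unbounded preceding and current row) as running_total,
           -- the whole partition: the same grand total on every row
           sum(countSum) over (order by month
               rows between unbounded preceding and unbounded following) as full_total,
           -- one row before to one row after: a moving sum
           sum(countSum) over (order by month
               rows between 1 preceding and 1 following) as moving_sum
    from product_cost
    order by month
    """
).fetchall()

for row in rows:
    print(row)
```

At the edges of the partition the moving window simply contains fewer rows, which is why the first and last moving sums cover only two months.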

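Examples:

1. (lag function) Compare each product's monthly sales amount with that of the previous month

 select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,(pc.countsum*p.price) as currentAmt,
        lag(pc.countsum*p.price,1,pc.countsum*p.price) over(partition by year,pc.pid order by year,month,pid) as beforeAmt
   from    product_cost pc
   left join product p on p.id=pc.pid

Note: the expression inside lag cannot use a column alias (you cannot write "currentAmt", only "pc.countsum*p.price"); otherwise Oracle reports an invalid identifier error. The reason is that all items of one select list are resolved at the same level, so Oracle does not yet know the alias at that point. If the alias is defined in a subquery, which is resolved before the analytic function, it can be used.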
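The subquery workaround mentioned in the note can be sketched like this. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle, and a hypothetical price of 2.0 for the product, since the sample data does not give prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (id int, pname text, price real)")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.execute("insert into product values (1, 'i7-6700K', 2.0)")  # hypothetical price
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 2, 1, 1500), (2000, 3, 1, 520)],
)

# The alias currentAmt is defined in the inner query, so the outer
# query can pass it to lag. The third argument is the default used
# when there is no previous row (here: the current amount itself).
rows = conn.execute(
    """
    select month, currentAmt,
           lag(currentAmt, 1, currentAmt)
               over (partition by year, pid order by month) as beforeAmt
    from (
        select pc.year, pc.month, pc.pid, pc.countSum * p.price as currentAmt
        from product_cost pc
        left join product p on p.id = pc.pid
    ) t
    order by month
    """
).fetchall()

for row in rows:
    print(row)
```

The first month has no predecessor, so its beforeAmt falls back to its own amount.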

2. (lead function) Same as above, except that it looks forward to the following rows instead of backward to the preceding ones.
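A minimal sketch of the forward-looking counterpart, again assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle. No default is passed here, so the last row gets NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 2, 1, 1500), (2000, 3, 1, 520)],
)

# lead(countSum, 1) reads the value one row after the current row
# within the same (year, pid) partition.
rows = conn.execute(
    """
    select month, countSum,
           lead(countSum, 1) over (partition by year, pid order by month) as nextCount
    from product_cost
    order by month
    """
).fetchall()

for row in rows:
    print(row)
```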

3. (first_value function) Find the highest product sales amount of each month

select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,(p.price*pc.countsum) as currentAmt,
       first_value(p.price*pc.countsum) over(partition by year,month order by year,month,p.price*pc.countsum DESC) as topAmt
   from product_cost pc
   left join product p on pc.pid=p.id

Aggregate functions can also be used as analytic functions, so the SQL above can be rewritten as follows:

select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,(p.price*pc.countsum) as currentAmt,
       max(p.price*pc.countsum) over(partition by year,month order by year,month,p.price*pc.countsum DESC) as topAmt
   from product_cost pc
   left join product p on pc.pid=p.id

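4. last_value function: same idea as above; it returns the last row of the window, whether or not the partition is ordered. Note that with order by and the default window (which ends at the current row), last_value returns the value of the current row (or of its last peer); specify rows between unbounded preceding and unbounded following to get the last row of the whole partition.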
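How much the window matters for last_value can be seen in a small comparison. This is a sketch that assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle; both use the same default window when order by is present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 1, 2, 630), (2000, 1, 3, 1200)],
)

rows = conn.execute(
    """
    select pid, countSum,
           -- default window ends at the current row
           last_value(countSum) over (partition by month
               order by countSum desc) as default_window,
           -- explicit window covers the whole partition
           last_value(countSum) over (partition by month
               order by countSum desc
               rows between unbounded preceding and unbounded following) as full_window
    from product_cost
    order by countSum desc
    """
).fetchall()

for row in rows:
    print(row)
```

With the default window each row just sees itself as the last row; only the explicit full window yields the lowest quantity of the month on every row.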

5. nth_value function: returns the value of a specified row, for example: the sales amount of the product ranked second in each month

select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,(p.price*pc.countsum) as currentAmt,
       nth_value(p.price*pc.countsum,2) over(partition by year,month order by year,month,p.price*pc.countsum DESC  rows between unbounded preceding and unbounded following) as secondAmt
   from product_cost pc
   left join product p on pc.pid=p.id
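A small sketch that puts first_value and nth_value next to each other, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle. It ranks by quantity rather than by amount because the sample data has no prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 1, 2, 630), (2000, 1, 3, 1200), (2000, 1, 4, 320)],
)

# The explicit full-partition window makes the second-ranked value
# visible from every row, including the first one.
rows = conn.execute(
    """
    select pid, countSum,
           first_value(countSum) over (partition by year, month
               order by countSum desc
               rows between unbounded preceding and unbounded following) as top_count,
           nth_value(countSum, 2) over (partition by year, month
               order by countSum desc
               rows between unbounded preceding and unbounded following) as second_count
    from product_cost
    order by countSum desc
    """
).fetchall()

for row in rows:
    print(row)
```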

6. rank function: one of the ranking functions. Tied rows share a rank and the following ranks are skipped; the ranking is based on the order by columns. Example: rank the products of each month by monthly sales quantity

 select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,rank() over(partition by year,month order by pc.countsum DESC) as orderName
   from product_cost pc
   left join product p on pc.pid=p.id

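7. dense_rank function: one of the ranking functions. Tied rows share a rank and no ranks are skipped; the ranking is based on the order by columns. Example: rank the products of each month by monthly sales quantity

 select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,dense_rank() over(partition by year,month order by pc.countsum DESC) as orderName
   from product_cost pc
   left join product p on pc.pid=p.id

Summary: rank and dense_rank have in common that rows with equal sort values get the same rank. The only difference is that rank skips ranks after a tie (the number of ranks skipped is the number of tied rows minus one), whereas dense_rank never skips, no matter how many equal values there are.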
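The difference summarized above shows up as soon as there is a tie that is not at the bottom of the list. The sketch below assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle, and uses hypothetical quantities chosen so that two products tie for first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    # hypothetical quantities: products 1 and 2 tie at 500
    [(2000, 1, 1, 500), (2000, 1, 2, 500), (2000, 1, 3, 320), (2000, 1, 4, 250)],
)

rows = conn.execute(
    """
    select pid, countSum,
           rank()       over (partition by year, month order by countSum desc) as rnk,
           dense_rank() over (partition by year, month order by countSum desc) as dense_rnk
    from product_cost
    order by countSum desc, pid
    """
).fetchall()

for row in rows:
    print(row)
```

Two rows tie for rank 1, so rank skips one value and continues at 3, while dense_rank continues at 2.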

8. row_number function: assigns each row a unique sequential number within its partition, for example:

 select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,row_number() over(partition by year,month order by pc.countsum DESC) as orderName
   from product_cost pc
   left join product p on pc.pid=p.id
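A common use of row_number is picking the top row of each group by filtering on the number in an outer query. The following is a minimal sketch of that pattern, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle; the alias rn is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 1, 2, 630), (2000, 2, 1, 1500), (2000, 2, 2, 1630)],
)

# Number the rows of each month from the highest quantity down,
# then keep only the row numbered 1.
rows = conn.execute(
    """
    select month, pid, countSum
    from (
        select month, pid, countSum,
               row_number() over (partition by year, month
                                  order by countSum desc) as rn
        from product_cost
    ) t
    where rn = 1
    order by month
    """
).fetchall()

for row in rows:
    print(row)
```

The filter has to live in the outer query because analytic functions are evaluated after the where clause of their own query level.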

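9. ratio_to_report function: computes the share of a value within its partition, or within the whole table or data set

   -- the 2 decimal places passed to trunc are only an example value
   select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,trunc(100*ratio_to_report(p.price*pc.countsum) over(partition by year,month),2)||'%' as proportion
   from product_cost pc
   left join product p on pc.pid=p.id
   order by year,month,p.price*pc.countsum DESC

Note: ratio_to_report does not accept an order by inside its over clause; sort in the outer order by instead.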
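What ratio_to_report computes is the value divided by the sum of the values in its partition. SQLite has no ratio_to_report, so the sketch below spells that division out with sum() over (...); it assumes Python's built-in sqlite3 module (SQLite 3.25 or later) and hypothetical round quantities so that the percentages are easy to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    # hypothetical quantities that add up to 1000
    [(2000, 1, 1, 500), (2000, 1, 2, 300), (2000, 1, 3, 200)],
)

# value / partition total, scaled to a percentage; this mirrors
# 100 * ratio_to_report(countSum) over (partition by year, month).
rows = conn.execute(
    """
    select pid, countSum,
           100.0 * countSum / sum(countSum) over (partition by year, month) as pct
    from product_cost
    order by countSum desc
    """
).fetchall()

for row in rows:
    print(row)
```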

10. percent_rank function: computes the rank as a number between 0 and 1, calculated as (rank - 1) / (rows in the partition - 1); the lower the value, the higher the rank

    select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,percent_rank() over(partition by year,month order by countsum DESC) as proportion
   from product_cost pc
   left join product p on pc.pid=p.id
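percent_rank is defined as (rank - 1) / (number of rows in the partition - 1). A minimal sketch with five rows, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle, makes the steps of 0.25 visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 1, 2, 630), (2000, 1, 3, 1200),
     (2000, 1, 4, 320), (2000, 1, 5, 250)],
)

# Five rows and no ties: ranks 1..5 map to 0, 1/4, 2/4, 3/4, 1.
rows = conn.execute(
    """
    select pid, countSum,
           percent_rank() over (partition by year, month
                                order by countSum desc) as pct_rank
    from product_cost
    order by countSum desc
    """
).fetchall()

for row in rows:
    print(row)
```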

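11. ntile function: handy for collecting problematic rows in one common bucket. Example: the boss says the customer does not want to see records with sales below 300

   -- ntile requires the number of buckets as its argument; 3 is only an example value
   select pc.id as id, pc.year as year,pc.month as month,p.pname as pname,p.price as price,pc.countsum as countSum,ntile(3) over(partition by year,month order by countsum DESC) as bucketNo
   from product_cost pc
   left join product p on pc.pid=p.id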
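How ntile distributes rows can be seen with the seven January 2000 rows of the sample data. The sketch assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for Oracle; the bucket count of 3 is an arbitrary choice for illustration, and pid is added to the sort only to make the order of the two tied rows deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product_cost (year int, month int, pid int, countSum int)")
conn.executemany(
    "insert into product_cost values (?, ?, ?, ?)",
    [(2000, 1, 1, 500), (2000, 1, 2, 630), (2000, 1, 3, 1200), (2000, 1, 4, 320),
     (2000, 1, 5, 250), (2000, 1, 6, 250), (2000, 1, 7, 350)],
)

# Seven rows in three buckets: the sizes are 3, 2 and 2,
# with the larger bucket coming first.
rows = conn.execute(
    """
    select pid, countSum,
           ntile(3) over (partition by year, month
                          order by countSum desc, pid) as bucket
    from product_cost
    order by countSum desc, pid
    """
).fetchall()

for row in rows:
    print(row)
```

The lowest sellers end up together in the last bucket, which can then be filtered out in an outer query.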

12. listagg function: turns column values from multiple rows into a single delimited list

   select listagg(p.pname,',') within group (order by p.id) as str
   from product p

-- Result: i7-6700K,i7-6600K,i7-6500K,i7-6400K,i7-6300K,i7-6200K,i7-6100K

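As a side note, here are the ways to concatenate strings:

(1). A user-defined function
(2). The "||" operator
(3). The wmsys.wm_concat function (undocumented, and not available in recent Oracle releases)
(4). The listagg function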
