方案一:请参考《数据库“行专列”操作---使用row_number()over(partition by 分组字段 [order by 排序字段])》,该方案是sqlserver,oracle,mysql,hive均适用的。

在hive中的方案分为以下两种方案:

创建测试表,并插入测试数据:

--hive 测试 行转列 collect_set collect_list
create table tommyduan_test(
gridid string,
height int,
cell string,
mrcount int,
weakmrcount int
); insert into tommyduan_test values('g1',1,'cell1',12,3);
insert into tommyduan_test values('g1',1,'cell2',22,3);
insert into tommyduan_test values('g1',1,'cell3',23,3);
insert into tommyduan_test values('g1',1,'cell4',1,3);
insert into tommyduan_test values('g1',1,'cell5',3,3);
insert into tommyduan_test values('g1',1,'cell6',4,3);
insert into tommyduan_test values('g1',1,'cell19',21,3); insert into tommyduan_test values('g2',1,'cell4',1,3);
insert into tommyduan_test values('g2',1,'cell5',3,3);
insert into tommyduan_test values('g2',1,'cell6',4,3);
insert into tommyduan_test values('g2',1,'cell19',21,3);

方案二:使用collect_set方案

注意:collect_set是一个set集合,不允许重复的记录插入

select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height;
+---------+---------+-----------------------------+---------------+-------------------+--+
| gridid | height | cellarray | mrcountarray | weakmrcountarray |
+---------+---------+-----------------------------+---------------+-------------------+--+
| g1 | 1 | ["cell3","cell2","cell19"] | [23,22,21] | [3,3,3] |
| g2 | 1 | ["cell19","cell6","cell5"] | [21,4,3] | [3,3,3] |
+---------+---------+-----------------------------+---------------+-------------------+--+ select gridid,height,
(case when size(cellArray)>0 then cellArray[] else '-9999' end) as cell1,
(case when size(cellArray)>0 then mrcountArray[] else '-9999' end) as cell1_mrcount,
(case when size(cellArray)>0 then weakmrcountArray[] else '-9999' end) as cell1_weakmrcount,
(case when size(cellArray)>1 then cellArray[] else '-9999' end) as cell2,
(case when size(cellArray)>1 then mrcountArray[] else '-9999' end) as cell2_mrcount,
(case when size(cellArray)>1 then weakmrcountArray[] else '-9999' end) as cell2_weakmrcount,
(case when size(cellArray)>2 then cellArray[] else '-9999' end) as cell3,
(case when size(cellArray)>2 then mrcountArray[] else '-9999' end) as cell3_mrcount,
(case when size(cellArray)>2 then weakmrcountArray[] else '-9999' end) as cell3_weakmrcount
from
(
select gridid,height,collect_list(cell) cellArray,collect_list(mrcount) mrcountArray,collect_list(weakmrcount) weakmrcountArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height
) t12;
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| gridid | height | cell1 | cell1_mrcount | cell1_weakmrcount | cell2 | cell2_mrcount | cell2_weakmrcount | cell3 | cell3_mrcount | cell3_weakmrcount |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| g1 | 1 | cell3 | 23 | 3 | cell2 | 22 | 3 | cell19 | 21 | 3 |
| g2 | 1 | cell19 | 21 | 3 | cell6 | 4 | 3 | cell5 | 3 | 3 |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+

方案三:使用collect_list/collect_all方案

注意:collect_set是一个set集合,不允许重复的记录插入

select gridid,height,collect_set(cell),collect_set(mrcount),collect_set(weakmrcount)
from (select * from tommyduan_test order by gridid,height,mrcount desc) t10
group by gridid,height;
+---------+---------+-------------------------------------------------------------+----------------------+------+--+
| gridid | height | _c2 | _c3 | _c4 |
+---------+---------+-------------------------------------------------------------+----------------------+------+--+
| g1 | 1 | ["cell3","cell2","cell19","cell1","cell6","cell5","cell4"] | [23,22,21,12,4,3,1] | [] |
| g2 | 1 | ["cell19","cell6","cell5","cell4"] | [21,4,3,1] | [] |
+---------+---------+-------------------------------------------------------------+----------------------+------+--+ select gridid,height,collect_set(cell) cellArray,collect_set(mrcount) mrcountArray,collect_set(weakmrcount) weakmrcountArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height;
+---------+---------+-----------------------------+---------------+-------------------+--+
| gridid | height | cellarray | mrcountarray | weakmrcountarray |
+---------+---------+-----------------------------+---------------+-------------------+--+
| g1 | 1 | ["cell3","cell2","cell19"] | [23,22,21] | [] |
| g2 | 1 | ["cell19","cell6","cell5"] | [21,4,3] | [] |
+---------+---------+-----------------------------+---------------+-------------------+--+ select gridid,height,collect_set(concat_ws(',',cell,cast(mrcount as string), cast(weakmrcount as string))) as cellArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height
+---------+---------+--------------------------------------------+--+
| gridid | height | cellarray |
+---------+---------+--------------------------------------------+--+
| g1 | 1 | ["cell3,23,3","cell2,22,3","cell19,21,3"] |
| g2 | 1 | ["cell19,21,3","cell6,4,3","cell5,3,3"] |
+---------+---------+--------------------------------------------+--+ select gridid,height,
(case when size(cellArray)>0 then split(cellArray[],'_')[] else '-9999' end) as cell1,
(case when size(cellArray)>0 then split(cellArray[],'_')[] else '-9999' end) as cell1_mrcount,
(case when size(cellArray)>0 then split(cellArray[],'_')[] else '-9999' end) as cell1_weakmrcount,
(case when size(cellArray)>1 then split(cellArray[],'_')[] else '-9999' end) as cell2,
(case when size(cellArray)>1 then split(cellArray[],'_')[] else '-9999' end) as cell2_mrcount,
(case when size(cellArray)>1 then split(cellArray[],'_')[] else '-9999' end) as cell2_weakmrcount,
(case when size(cellArray)>2 then split(cellArray[],'_')[] else '-9999' end) as cell3,
(case when size(cellArray)>2 then split(cellArray[],'_')[] else '-9999' end) as cell3_mrcount,
(case when size(cellArray)>2 then split(cellArray[],'_')[] else '-9999' end) as cell3_weakmrcount
from
(
select gridid,height,collect_set(concat_ws('_',cell,cast(mrcount as string), cast(weakmrcount as string))) as cellArray
from (
select gridid,height,cell,mrcount,weakmrcount,row_number()over(partition by gridid,height order by mrcount desc) rn
from tommyduan_test
group by gridid,height,cell,mrcount,weakmrcount
) t10
where rn<4
group by gridid,height
) t12;
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| gridid | height | cell1 | cell1_mrcount | cell1_weakmrcount | cell2 | cell2_mrcount | cell2_weakmrcount | cell3 | cell3_mrcount | cell3_weakmrcount |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+
| g1 | 1 | cell3 | 23 | 3 | cell2 | 22 | 3 | cell19 | 21 | 3 |
| g2 | 1 | cell19 | 21 | 3 | cell6 | 4 | 3 | cell5 | 3 | 3 |
+---------+---------+---------+----------------+--------------------+--------+----------------+--------------------+---------+----------------+--------------------+--+

hive:数据库“行专列”操作---使用collect_set/collect_list/collect_all & row_number()over(partition by 分组字段 [order by 排序字段])的更多相关文章

  1. 数据库“行专列”操作---使用row_number()over(partition by 分组字段 [order by 排序字段])

    测试样例: create table test(rsrp string,rsrq string,tkey string,distan string); '); '); '); '); select * ...

  2. dos命令行连接操作ORACLE数据库

    C:\Adminstrator> sqlplus "/as sysdba" 查看是否连接到数据库 SQL> select status from v$instance; ...

  3. hive函数应用之操作json

    1.创建表 createtable.sql中存放的创建表语句如下 create external table adt.jsontest ( appKey string comment "AP ...

  4. Python(数据库之表操作)

    一.修改表 1. 修改表名 ALTER TABLE 表名 RENAME 新表名; #mysql中库名.表名对大小写不敏感 2. 增加字段 ALTER TABLE 表名ADD 字段名 数据类型 [完整性 ...

  5. SQL Server数据库--》top关键字,order by排序,distinct去除重复记录,sql聚合函数,模糊查询,通配符,空值处理。。。。

    top关键字:写在select后面 字段的前面 比如你要显示查询的前5条记录,如下所示: select top 5 * from Student 一般情况下,top是和order by连用的 orde ...

  6. hive 分组排序函数 row_number() over(partition by " " order by " "desc

    语法:row_number() over (partition by 字段a order by 计算项b desc ) rank --这里rank是别名 partition by:类似hive的建表, ...

  7. Hive数据库操作

    Hive数据结构 除了基本数据类型(与java类似),hive支持三种集合类型 Hive集合类型数据 array.map.structs hive (default)> create table ...

  8. 大数据开发实战:离线大数据处理的主要技术--Hive,概念,SQL,Hive数据库

    1.Hive出现背景 Hive是Facebook开发并贡献给Hadoop开源社区的.它是建立在Hadoop体系架构上的一层SQL抽象,使得数据相关人员使用他们最为熟悉的SQL语言就可以进行海量数据的处 ...

  9. HIVE的sql语句操作

    Hive 是基于Hadoop 构建的一套数据仓库分析系统,它提供了丰富的SQL查询方式来分析存储在hadoop 分布式文件系统中的数据,可以将结构 化的数据文件映射为一张数据库表,并提供完整的SQL查 ...

随机推荐

  1. FMDatabaseQueue 如何保证线程安全

    这篇文章原来在用 Github Pages 搭建的博客上,现在决定重新用回博客园,所以把文章搬回来. FMDB 是 OC 针对 sqlite 的封装.在其文档的线程安全部分这样讲:同时从多个线程使用同 ...

  2. spring boot 2.0.0由于版本不匹配导致的NoSuchMethodError问题解析

    spring boot升级到2.0.0以后,项目突然报出 NoSuchMethodError: org.springframework.boot.builder.SpringApplicationBu ...

  3. UITableViewStyleGrouped模式下多余间距

    第一个section上边多余间距处理 // 隐藏UITableViewStyleGrouped上边多余的间隔 _tableView.tableHeaderView = [[UIView alloc] ...

  4. 【Python】 基于秘钥的对称加密

    [Crypto] 关于用python进行信息的加密,类似的解决方案有很多比如用base64编码进行encode,再或者是hashlib来进行hash.但是还缺少一种明明场景很简单的解决方案,就是把利用 ...

  5. Linux创建普通用户以及权限的分配

    LINUX系统能创建一个普通用户,给开发人员让他们登录吗? 答案:可以. 怎么做? 答案:一般给开发 创建一个目录账户 他要做什么操作 就给什么权限 useradd命令 useradd可用来建立用户帐 ...

  6. [poj3280]Cheapest Palindrome_区间dp

    Cheapest Palindrome poj-3280 题目大意:给出一个字符串,以及每种字符的加入代价和删除代价,求将这个字符串通过删减元素变成回文字符串的最小代价. 注释:每种字符都是小写英文字 ...

  7. android中activity.this跟getApplicationContext的区别

    转载: http://www.myexception.cn/android/1968332.html android中activity.this和getApplicationContext的区别 在a ...

  8. 多线程使用Lock实现生产者实现者代码

    package cn.com.servyou.qysdsjx.thread; import java.util.ArrayList; import java.util.List; import jav ...

  9. [Java反射机制]用反射改进简单工厂模式设计

    如果做开发的工作,工厂设计模式大概都已经深入人心了,比较常见的例子就是在代码中实现数据库操作类,考虑到后期可能会有数据库类型变换或者迁移,一般都会对一个数据库的操作类抽象出来一个接口,然后用工厂去获取 ...

  10. 记录python接口自动化测试--利用unittest生成测试报告(第四目)

    前面介绍了是用unittest管理测试用例,这次看看如何生成html格式的测试报告 生成html格式的测试报告需要用到 HTMLTestRunner,在网上下载了一个HTMLTestRunner.py ...