hive复杂数据类型的用法

1、简单描述
2、测试

1、简单描述

arrays: ARRAY<data_type>
maps: MAP<primitive_type, data_type>
structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
union: UNIONTYPE<data_type, data_type, ...>

Hive 中对该类型的完全支持仍然不完整。如果 JOIN、WHERE 和 GROUP BY 子句中引用的 UNIONTYPE 字段的查询将会失败，Hive 没有定义语法来提取 UNIONTYPE 的 tag 或 value 字段。

复杂数据类型的构造函数：

构造函数	操作数	描述
map	(key1, value1, key2, value2, ...)	Creates a map with the given key/value pairs.
struct	(val1, val2, val3, ...)	Creates a struct with the given field values. Struct field names will be col1, col2, ....
named_struct	(name1, val1, name2, val2, ...)	Creates a struct with the given field names and values. (As of Hive 0.8.0.)
array	(val1, val2, ...)	Creates an array with the given elements.
create_union	(tag, val1, val2, ...)	Creates a union type with the value that is being pointed to by the tag parameter.

注：create_union 中的 tag 让我们知道 union 的哪一部分正在被使用。

复杂数据类型访问元素：

构造函数	操作数	描述
A[n]	A is an Array and n is an int	Returns the nth element in the array A. The first element has index 0. For example, if A is an array comprising of ['foo', 'bar'] then A[0] returns 'foo' and A[1] returns 'bar'.
M[key]	M is a Map<K, V> and key has type K	Returns the value corresponding to the key in the map. For example, if M is a map comprising of {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'} then M['all'] returns 'foobar'.
S.x	S is a struct	Returns the x field of S. For example for the struct foobar {int foo, int bar}, foobar.foo returns the integer stored in the foo field of the struct.

2、测试

-- ------------------------------ ARRAY ------------------------------

-- ARRAY<data_type>

create table arraytest (id int,info array<string>)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 不要忽略`collection items terminated by ','

-- 它表示数组元素间的分隔符

-- 如果忽略了输出是这样的：

hive> select * from arraytest;

OK

1       ["zhangsan,male"]

2       ["lisi,male"]

-- 数据

1	zhangsan,male

2	lisi,male

-- 导入

load data local inpath '/root/data/arraytest.txt' into table arraytest;

-- 查看

hive> select * from arraytest;

OK

1       ["zhangsan","male"]

2       ["lisi","male"]

-- 索引查看数组元素

hive> select id,info[0] from arraytest;

OK

1       zhangsan

2       lisi

-- 将数组的所有元素展开输出

hive> select explode(info) from arraytest;

OK

zhangsan

male

lisi

male

-- ------------------------------ MAP ------------------------------

-- MAP<primitive_type, data_type>

create table maptest (id int,info map<string,string>)

row format delimited

fields terminated by '\t'

collection items terminated by ','

map keys terminated by ':'

stored as textfile;

-- 不要忽略`map keys terminated by ':'

-- 它表示键值间的分隔符

-- 数据

1	name:zhangsan,sex:male

2	name:lisi,sex:male

-- 导入

load data local inpath '/root/data/maptest.txt' into table maptest;

-- 查看

hive> select * from maptest;

OK

1       {"name":"zhangsan","sex":"male"}

2       {"name":"lisi","sex":"male"}

-- 查看map元素

hive> select id,info["name"] from maptest;

OK

1       zhangsan

2       lisi

-- ------------------------------ STRUCT ------------------------------

-- STRUCT<col_name : data_type [COMMENT col_comment], ...>

create table structtest (id int,info struct<name:string,sex:string>)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 数据

1	zhangsan,male

2	lisi,male

-- 导入

load data local inpath '/root/data/structtest.txt' into table structtest;

-- 查看

hive> select * from structtest;

OK

1       {"name":"zhangsan","sex":"male"}

2       {"name":"lisi","sex":"male"}

hive> select id,info.name from structtest;

OK

1       zhangsan

2       lisi

-- ------------------------------ 综合array\map\struct ------------------------------

create table alltest(

    id int,

    name string,

    salary bigint,

    sub array<string>,

    details map<string, int>,

    address struct<city:string, state:string, pin:int>

)

row format delimited

fields terminated by ','

collection items terminated by '$'

map keys terminated by '#'

stored as textfile;

-- 数据

1,abc,40000,a$b$c,pf#500$epf#200,hyd$ap$500001

2,def,3000,d$f,pf#500,bang$kar$600038

4,abc,40000,a$b$c,pf#500$epf#200,bhopal$MP$452013

5,def,3000,d$f,pf#500,Indore$MP$452014

-- 导入数据

load data local inpath '/root/data/alltest.txt' into table alltest;

-- 查看

hive> select * from alltest;

OK

1       abc     40000   ["a","b","c"]   {"pf":500,"epf":200}    {"city":"hyd","state":"ap","pin":500001}

2       def     3000    ["d","f"]       {"pf":500}      {"city":"bang","state":"kar","pin":600038}

4       abc     40000   ["a","b","c"]   {"pf":500,"epf":200}    {"city":"bhopal","state":"MP","pin":452013}

5       def     3000    ["d","f"]       {"pf":500}      {"city":"Indore","state":"MP","pin":452014}

-- ------------------------------ UNIONTYPE ------------------------------

-- create_union(tag, val1, val2, ...)

-- Creates a union type with the value that is being pointed to by the tag parameter. 

-- ---- 简单示例：里面都是基本类型 ------

create table uniontest(

    id int,

    info uniontype<string,string>

)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 插入数据：insert into

-- tag 索引后面的值是从 0 开始的

insert into table uniontest

    values

    (1,create_union(0,"zhangsan","male")),  -- 使用 "zhangsan"

    (1,create_union(1,"zhangsan","male")),  -- 使用 "male"

    (2,create_union(0,"lisi","female")),

    (2,create_union(1,"lisi","female"));

-- 查看

hive> select * from uniontest;

OK

1       {0:"zhangsan"}

1       {1:"male"}

2       {0:"lisi"}

2       {1:"female"}

-- 数据

1	0,zhangsan

1	1,male

2	0,lisi

2	1,female

-- 插入数据：load data

load data local inpath '/root/data/uniontest.txt' into table uniontest;

-- 查看

hive> select * from uniontest;

OK

1       {0:"zhangsan"}

1       {1:"male"}

2       {0:"lisi"}

2       {1:"female"}

-- 如果数据格式是这样的：

-- 1	0,zhangsan,male

-- 1	1,zhangsan,male

-- 2	0,lisi,female

-- 2	1,lisi,female

-- 会把后面的字符串当作一个整体，输出：

-- 1       {0:"zhangsan,male"}

-- 1       {1:"zhangsan,male"}

-- 2       {0:"lisi,female"}

-- 2       {1:"lisi,female"}

-- ---- 复杂示例：里面包含复杂类型 ------

create table uniontest_comp(

    id int,

    info uniontype<int,

                   string,

                   array<string>,

                   map<string,string>,

                   struct<sex:string,age:string>>

)

row format delimited

fields terminated by '\t'

collection items terminated by ','

stored as textfile;

-- 插入数据

-- 也可以使用 `insert into table ....select ....`

insert into table uniontest_comp

    values

    (1,create_union(0,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(1,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(2,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(3,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),

    (1,create_union(4,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33")));

-- 查看

hive> select * from uniontest_comp;

OK

1       {0:1}

1       {1:"zhangsan"}

1       {2:["male","33"]}

1       {3:{"sex":"male","age":"33"}}

1       {4:{"sex":"male","age":"33"}}

参考：http://querydb.blogspot.com/2015/11/hive-complex-data-types.html

hive复杂数据类型的用法的更多相关文章

大数据时代的技术hive：hive的数据类型和数据模型
在上篇文章里,我列举了一个简单的hive操作实例,创建了一张表test,并且向这张表加载了数据,这些操作和关系数据库操作类似,我们常把hive和关系数据库进行比较,也正是因为hive很多知识点和关系数 ...
day01-day04总结- Python 数据类型及其用法
Python 数据类型及其用法: 本文总结一下Python中用到的各种数据类型,以及如何使用可以使得我们的代码变得简洁. 基本结构我们首先要看的是几乎任何语言都具有的数据类型,包括字符串.整型.浮点 ...
Hive 5、Hive 的数据类型和 DDL Data Definition Language)
官方帮助文档:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL Hive的数据类型 -- 扩展数据类型data_t ...
hadoop笔记之Hive的数据类型
Hive的数据类型 Hive的数据类型前面说过,Hive是一个数据仓库,相当于一个数据库.既然是数据库,那么就必须能创建表,既然有表,那么当中就有列,列中就有对应的类型总的来讲,hive的数据类型 ...
Hive之数据类型
Hive之数据类型 (本文是基于多篇文章根据个人理解进行的整合,参考的文章见末尾的整理) 数据类型 Hive支持两种数据类型,一类叫原子数据类型,一类叫复杂数据类型.原子数据类型包括数值型.布尔型 ...
Hive 复杂数据类型的使用
Hive复杂数据类型 1.Array数据类型的使用 1.1.创建数据库表,以array作为数据类型 hive (hive_demo1)> create table stu_test(name a ...
《Hive编程指南》读书笔记 | 一文看懂Hive的数据类型和文件格式
Hive支持关系型数据库中的大多数基本数据类型,同时也支持关系型数据库中很少出现的3种集合数据类型. 和大多数数据库相比,Hive具有一个独特的功能,那就是其对于数据在文件中的编码方式具有非常大的灵活 ...
Mybatis中动态SQL语句中的parameterType不同数据类型的用法
Mybatis中动态SQL语句中的parameterType不同数据类型的用法1. 简单数据类型, 此时#{id,jdbcType=INTEGER}中id可以取任意名字如#{a,jdbcType ...
【Kylin实战】Hive复杂数据类型与视图
1. 引言在分析广告日志时,会有这样的多维分析需求: 曝光.点击用户分别有多少? 标签能覆盖多少广告用户? 各个标签(标注)类别能覆盖的曝光.点击在各个DSP上所覆盖的用户数 -- 广告数据与标签数 ...

随机推荐

Codeforces Round #651 (Div. 2) C. Number Game（数论）
题目链接:https://codeforces.com/contest/1370/problem/C 题意给出一个正整数 $n$,Ashishgup 和 FastestFinger 依次选择执行以下 ...
P1251 餐巾计划 (网络流)
题意:餐厅每天会需要用Ri块新的餐巾用完后也会产生Ri块旧的餐巾每天购买新的餐巾单价p元每天产出的旧餐巾可以送到快洗部花费每张c1元在i + v1天可以使用也可以花费c2元每张送到慢洗部在 ...
LINUX - Libevent
参考: https://dulishu.top/libevent-event-loop/
在Python中使用BeautifulSoup进行网页爬取
目录什么是网页抓取? 为什么我们要从互联网上抓取数据? 网站采集合法吗? HTTP请求/响应模型创建网络爬虫步骤1:浏览并检查网站/网页步骤2:创建用户代理步骤3:导入请求库检查状态码步 ...
python工业互联网应用实战6—任务分解
根据需求定义"任务"是一个完整的业务搬运流程,整个流程涉及到多个机构(设备)分别动作执行多个步骤,所以依据前面的模型设计,需要把任务分解到多个连续的子任务(作业),未来通过顺序串联 ...
sql-libs(1) -字符型注入
关于sql-libs的安装就不做过多的说明, 环境:win7虚拟机 192.168.48.130(NAT连接),然后用我的win10物理机去访问. 直接加 ' 报错,后测试 and '1'='1 成功 ...
Python+OpenCV+图片旋转并用原底色填充新四角
import cv2 from math import fabs, sin, cos, radians import numpy as np from scipy.stats import mode ...
Mac 外接 Dell 4K 显示器字体模糊解决办法
Mac 外接 Dell 4K 显示器字体模糊解决办法 mac mini mbp 2020 refs https://zhuanlan.zhihu.com/p/52100804 xgqfrms 2012 ...
Apple Support
Apple Support Send Files to Apple Support https://gigafiles.apple.com/#/customerupload refs 无法截屏 bug ...
HTML5 动效
HTML5 动效 motion graphics toolbelt for the web https://github.com/xgqfrms/mojs A collection of loadin ...

hive复杂数据类型的用法

1、简单描述

2、测试

hive复杂数据类型的用法的更多相关文章

随机推荐

热门专题