大数据学习——hive数据类型

o_0的园子 2024-10-31 06:28:54 原文

1. hive的数据类型
Hive的内置数据类型可以分为两大类：(1)、基础数据类型；(2)、复杂数据类型
2. hive基本数据类型
基础数据类型包括：TINYINT,SMALLINT,INT,BIGINT,BOOLEAN,FLOAT,DOUBLE,STRING,BINARY,TIMESTAMP,DECIMAL,CHAR,VARCHAR,DATE。

3. hive集合类型
集合类型主要包括：array，map，struct等，hive的特性支持集合类型，这特性是关系型数据库所不支持的，利用好集合类型可以有效提升SQL的查询速率。
3.1 集合类型之array
(1) 先创建一张表

create table t_array(id int,name string,hobby array<string>)

row format delimited

fields terminated by ','

collection items terminated by '-';

(2) 准备数据文件 array.txt

1,zhangsan,唱歌-跳舞-游泳

2,lisi,打游戏-篮球

(2) 加载数据文件到t_array表中

load data local inpath '/root/array.txt' into table t_array;

(3) 查询数据

select id ,name,hobby[],hobby[] from t_array;

注意：array的访问元素和java中是一样的，这里通过索引来访问。

3.2 集合类型之map

(1) 先创建一张表

create table t_map(id int,name string,hobby map<string,string>)

row format delimited

fields terminated by ','

collection items terminated by '-'

map keys terminated by ':' ;

(2) 准备数据文件 map.txt

1,zhangsan,唱歌:非常喜欢-跳舞:喜欢-游泳:一般般

2,lisi,打游戏:非常喜欢-篮球:不喜欢

(2) 加载数据文件到t_map表中

load data local inpath '/root/map.txt' into table t_map;

(3) 查询数据

select id,name,hobby['唱歌'] from t_map;

注意：map的访问元素中的value和java中是一样的，这里通过key来访问。

3.3集合类型之struct

(1) 先创建一张表

create table t_struct(id int,name string,address struct<country:string,city:string>)

row format delimited

fields terminated by ','

collection items terminated by '-';

(2) 准备数据文件 struct.txt

1,zhangsan,china-beijing

2,lisi,USA-newyork

(2) 加载数据文件到t_struct表中

load data local inpath  '/root/struct.txt'  into table t_struct;

(3) 查询数据

select id,name,address.country,address.city from t_struct;

总结：struct访问元素的方式是通过.符号

大数据学习——hive数据类型的更多相关文章

大数据学习——hive基本操作
1 建表 create table student(id int,name string ,age int) row format delimitedfields terminated by ','; ...
大数据学习——hive函数
1 内置函数测试各种内置函数的快捷方法: 1.创建一个dual表 create table dual(id string); 2.load一个文件(一行,一个空格)到dual表 3.select s ...
大数据学习——hive的sql练习
1新建一个数据库 create database db3; 2创建一个外部表 --外部表建表语句示例: create external table student_ext(Sno int,Sname ...
大数据学习——hive显示命令
show databases; desc t_partition001; desc extended t_partition002; desc formatted t_partition002; !c ...
大数据学习——hive安装部署
1上传压缩包 2 解压 tar -zxvf apache-hive-1.2.1-bin.tar.gz -C apps 3 重命名 mv apache-hive-1.2.1-bin hive 4 设置环 ...
大数据学习——hive的sql练习题
ABC三个hive表每个表中都只有一列int类型且列名相同,求三个表中互不重复的数 create table a(age int) row format delimited fields termi ...
大数据学习——hive数仓DML和DDL操作
1 创建一个分区表 create table t_partition001(ip string,duration int) partitioned by(country string) row for ...
大数据学习——hive使用
Hive交互shell bin/hive Hive JDBC服务 hive也可以启动为一个服务器,来对外提供启动方式,(假如是在itcast01上): 启动为前台:bin/hiveserver2 启 ...
大数据学习day26----hive01----1hive的简介 2 hive的安装（hive的两种连接方式，后台启动，标准输出，错误输出）3. 数据库的基本操作 4. 建表（内部表和外部表的创建以及应用场景，数据导入，学生、分数sql练习）5.分区表 6加载数据的方式
1. hive的简介(具体见文档) Hive是分析处理结构化数据的工具本质:将hive sql转化成MapReduce程序或者spark程序 Hive处理的数据一般存储在HDFS上,其分析数据底 ...

随机推荐

ogg 监控脚本
section 1: #! /bin/sh PATH=/usr/local/bin:$PATHORACLE_SID=statdb ORAENV_ASK=NO. oraenv > /dev/nul ...
用python编写的excel拆分小工具
from datetime import date,datetime from openpyxl import Workbook from openpyxl import load_workbook ...
vue 模拟后台数据（加载本地json文件）调试
首先创建一个本地json文件,放在项目中如下 { "runRedLight":{ "CurrentPage": 1, "TotalPages" ...
JS执行保存在数据库中的JS代码
function createScript(script) { var myScript = document.createElement("script"); myScript. ...
subprocess模块详解2
1.call() 和run功能类似,都是接受一个列表里的参数. >>> import subprocess >>> a = subprocess.call([&qu ...
nginx,php-fpm的安装配置
在centos7.2的系统下安装nginx和php-fpm nginx 安装 yum install -y nginx 即可完成安装配置由于之前项目使用的是apache,所以项目目录在var/ww ...
js递归和数组去重(简单便捷的用法)
1.递归例子<script type="text/javascript"> function test(num) { if(num < 0) { return; ...
分类IP地址(A、B、C类)的指派范围、一般不使用的特殊IP地址
分类IP地址(A.B.C类)的指派范围.一般不使用的特殊IP地址 A类地址:0开头,8位网络号 B类地址:10开头,16位网络号 C类地址:110开头,24位网络号 D类地址:1110开头,多播地址 ...
PHP环境搭建Zend Studio 10.6.2+WampServer2.4
址:http://www.zend.com/en/products/studio/downloads直接下载地址:http://downloads.zend.com/studio-eclipse/10 ...
Codeforces 1076D——最短路算法
题目给你一个有n个顶点.m条边的无向带权图.需要擦除一些边使得剩余的边数不超过k,如果一个点在原始图到顶点1的最短距离为d,在删边后的图中到顶点的最短距离仍是d,则称这种点是 good.问如何删边, ...