Basic Hive Operations
1. Start the Hive CLI with the hive command and run show databases; Hive ships with one built-in database named default.
[root@hadoop hive]# hive
Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive .X releases.
hive> show databases;
OK
default #the built-in default database that ships with Hive
Time taken: 21.043 seconds, Fetched: row(s)
hive>
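Next, log in to MySQL, which holds the Hive metadata. show databases; lists a database named hive; use hive; switches to it; show tables; lists its tables; and select * from DBS; shows the metadata Hive keeps for the default database.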
[root@hadoop ~]# mysql -uroot -proot
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is
Server version: 5.6.-log MySQL Community Server (GPL)

Copyright (c) , , Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hive |
| mysql |
| performance_schema |
| test |
+--------------------+
rows in set (0.32 sec)

mysql> use hive
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| KEY_CONSTRAINTS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
| WRITE_SET |
+---------------------------+
rows in set (0.00 sec)

mysql> select * from DBS; #metadata for Hive's built-in default database
+-------+-----------------------+----------------------------------------+---------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
| | Default Hive database | hdfs://hadoop:9000/user/hive/warehouse | default | public | ROLE |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
row in set (0.00 sec)

mysql>
2. Create a test database in Hive
hive> create database testhive; #create the database
OK
Time taken: 3.45 seconds
hive> show databases; #list the databases
OK
default
testhive
Time taken: 1.123 seconds, Fetched: row(s)
In MySQL, DBS now has a row for the test database, including the DB_ID of testhive and its storage location on HDFS.
mysql> select * from DBS;
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
| | Default Hive database | hdfs://hadoop:9000/user/hive/warehouse | default | public | ROLE |
| | NULL | hdfs://hadoop:9000/user/hive/warehouse/testhive.db | testhive | root | USER |
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
rows in set (0.00 sec)
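On HDFS, testhive.db turns out to be just a directory, so creating a database in Hive amounts to creating a directory under the warehouse path.
I had created the HDFS directory /usr/hive/warehouse/, yet the database was stored under /user/hive/warehouse/. The likely explanation is that hive.metastore.warehouse.dir defaults to /user/hive/warehouse, and Hive uses that path unless hive-site.xml overrides it. If so, the directory I should have created is /user/hive/warehouse/.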
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse
Found items
drwxr-xr-x - root supergroup -- : /user/hive/warehouse/testhive.db
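The directory naming seen above can be modelled in a few lines. This is a minimal Python sketch, not Hive's own code: the helper name database_location is hypothetical, and it assumes the warehouse root is the default value of hive.metastore.warehouse.dir and the NameNode address hdfs://hadoop:9000 shown in the DBS output.

```python
import posixpath

# Assumption: the default value of hive.metastore.warehouse.dir,
# used when hive-site.xml does not override it.
DEFAULT_WAREHOUSE_DIR = "/user/hive/warehouse"


def database_location(db_name, namenode="hdfs://hadoop:9000",
                      warehouse_dir=DEFAULT_WAREHOUSE_DIR):
    """Return the HDFS URI where a database directory is expected (hypothetical helper)."""
    if db_name.lower() == "default":
        # The default database lives directly in the warehouse root.
        return namenode + warehouse_dir
    # Any other database gets a "<name>.db" subdirectory.
    return namenode + posixpath.join(warehouse_dir, db_name.lower() + ".db")


print(database_location("default"))
print(database_location("testhive"))
```

The two printed URIs should match the DB_LOCATION_URI values in DBS, which is a quick way to check which warehouse path a given installation is actually using.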
3. Create a table
hive> use testhive; #switch to the database
OK
Time taken: 0.131 seconds
hive> create table test(id int); #create the table
OK
Time taken: 3.509 seconds
In MySQL, TBLS shows that the test table belongs to the database whose DB_ID is 6, which is testhive (check with select * from DBS;).
mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| | | | | root | | | test | MANAGED_TABLE | NULL | NULL | |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
row in set (0.01 sec)
On HDFS, Hive has created a directory for the new table.
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db
Found items
drwxr-xr-x - root supergroup -- : /user/hive/warehouse/testhive.db/test
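The link between TBLS and DBS is an ordinary foreign key on DB_ID, so the two lookups above can be combined into one join. The sketch below uses an in-memory SQLite database as a stand-in for the MySQL metastore and models only a few of the columns. DB_ID 6 comes from the text; DB_ID 1 and TBL_ID 1 are illustrative values, and the table path is derived by appending the table name, which is an assumption that holds only for tables created without an explicit location.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL metastore; only a few columns are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT, DB_LOCATION_URI TEXT);
CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER, TBL_NAME TEXT, TBL_TYPE TEXT);
-- DB_ID 6 is taken from the text; DB_ID 1 and TBL_ID 1 are illustrative.
INSERT INTO DBS VALUES (1, 'default', 'hdfs://hadoop:9000/user/hive/warehouse');
INSERT INTO DBS VALUES (6, 'testhive', 'hdfs://hadoop:9000/user/hive/warehouse/testhive.db');
INSERT INTO TBLS VALUES (1, 6, 'test', 'MANAGED_TABLE');
""")

# Resolve each table to its database and to the directory expected for it.
rows = conn.execute("""
    SELECT d.NAME, t.TBL_NAME, t.TBL_TYPE, d.DB_LOCATION_URI || '/' || t.TBL_NAME
    FROM TBLS t JOIN DBS d ON d.DB_ID = t.DB_ID
""").fetchall()

for db_name, tbl_name, tbl_type, location in rows:
    print(db_name, tbl_name, tbl_type, location)
```

The same SELECT, minus the derived path column, should also run against the real metastore in MySQL.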
4. Insert data
4.1 Insert a row with insert into test values (1); Hive runs the statement as a MapReduce job.
hive> insert into test values (1);
WARNING: Hive-on-MR is deprecated in Hive and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive .X releases.
Query ID = root_20180727155527_5971c7d8-9b5c-4ef3-98f7-63febe38c79a
Total jobs =
Launching Job out of
Number of reduce tasks is set to since there's no reduce operator
Starting Job = job_1532671010251_0001, Tracking URL = http://hadoop:8088/proxy/application_1532671010251_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1532671010251_0001
Hadoop job information for Stage-: number of mappers: ; number of reducers:
-- ::, Stage- map = %, reduce = %, Cumulative CPU 3.32 sec
MapReduce Total cumulative CPU time: seconds msec
Ended Job = job_1532671010251_0001
Stage- is selected by condition resolver.
Stage- is filtered out by condition resolver.
Stage- is filtered out by condition resolver.
Moving data to directory hdfs://hadoop:9000/user/hive/warehouse/testhive.db/test/.hive-staging_hive_2018-07-27_15-55-27_353_3121708441542170724-1/-ext-10000
Loading data to table testhive.test
MapReduce Jobs Launched:
Stage-Stage-: Map: Cumulative CPU: 3.32 sec HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: seconds msec
OK
Time taken: 453.982 seconds
On HDFS, Hive has written the inserted row to a file named 000000_0.
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0
4.2 Insert another row with insert into test values (2); this again runs as a MapReduce job.
hive> insert into test values (2);
On HDFS, the new row went into a second file, 000000_0_copy_1.
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found items
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_1
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0_copy_1
4.3 Insert one more row with insert into test values (3); this also runs as a MapReduce job.
On HDFS, the new row went into yet another file, 000000_0_copy_2.
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found items
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_1
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_2
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0_copy_2
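The file names in the three listings follow a simple pattern: each INSERT adds a new file and never rewrites an existing one. Here is a simplified Python model of that pattern. It is not Hive's implementation, and the helper name next_data_file is hypothetical.

```python
def next_data_file(existing, base="000000_0"):
    """Pick a name for a new data file without overwriting existing ones (simplified model)."""
    if base not in existing:
        return base
    n = 1
    # Try _copy_1, _copy_2, ... until a free name is found.
    while f"{base}_copy_{n}" in existing:
        n += 1
    return f"{base}_copy_{n}"


files = []
for _ in range(3):  # three separate INSERT statements
    files.append(next_data_file(files))
print(files)
```

One consequence of this behaviour is that many single-row inserts leave many tiny files in the table directory, and each of those inserts pays for a full MapReduce job.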
4.4 Query the table in Hive
hive> select * from test;
OK
1
2
3
Time taken: 5.483 seconds, Fetched: row(s)
5. Load data from a local file
First create the file.
[root@hadoop ~]# vi hive.txt #create the file, then save and quit
Then load it.
hive> load data local inpath '/root/hive.txt' into table testhive.test; #load the data
Loading data to table testhive.test
OK
Time taken: 6.282 seconds
In Hive, the contents of the file are now mapped onto the table's column.
hive> select * from test;
OK
Time taken: 0.534 seconds, Fetched: row(s)
On HDFS, hive.txt has been copied into the test table's directory.
[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found items
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_1
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_2
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/hive.txt
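Loading a file only copies it into the table directory; the schema is applied when the table is read. The Python sketch below models that idea with a local temporary directory in place of HDFS. The file contents are illustrative, since the real hive.txt is not shown, and the helper names scan_table and parse_int are hypothetical. It assumes the usual text-table behaviour in which a value that cannot be read as an int comes back as NULL.

```python
import os
import tempfile


def parse_int(text):
    """Return the value as an int, or None where a NULL would be expected."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def scan_table(table_dir):
    """Read every file in the table directory and map each line onto the id column."""
    rows = []
    for name in sorted(os.listdir(table_dir)):
        with open(os.path.join(table_dir, name)) as handle:
            rows.extend(parse_int(line) for line in handle.read().splitlines())
    return rows


with tempfile.TemporaryDirectory() as table_dir:
    # Illustrative contents: three inserted rows plus a loaded text file.
    contents = {
        "000000_0": "1\n",
        "000000_0_copy_1": "2\n",
        "000000_0_copy_2": "3\n",
        "hive.txt": "4\nabc\n5\n",
    }
    for name, body in contents.items():
        with open(os.path.join(table_dir, name), "w") as handle:
            handle.write(body)
    rows = scan_table(table_dir)

print(rows)
```

Because nothing is validated at load time, a malformed line in the file only shows up later as a NULL in query results.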
6. Hive also supports sorting with select * from test order by id desc; this query runs as a MapReduce job too.
hive> select * from test order by id desc;
WARNING: Hive-on-MR is deprecated in Hive and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive .X releases.
Query ID = root_20180730093619_c798eb69-b94f--94cc-5ec56865ed5c
Total jobs =
Launching Job out of
Number of reduce tasks determined at compile time:
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1532913019648_0001, Tracking URL = http://hadoop:8088/proxy/application_1532913019648_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1532913019648_0001
Hadoop job information for Stage-: number of mappers: ; number of reducers:
-- ::, Stage- map = %, reduce = %
-- ::, Stage- map = %, reduce = %, Cumulative CPU 1.66 sec
-- ::, Stage- map = %, reduce = %, Cumulative CPU 2.72 sec
-- ::, Stage- map = %, reduce = %, Cumulative CPU 5.41 sec
MapReduce Total cumulative CPU time: seconds msec
Ended Job = job_1532913019648_0001
MapReduce Jobs Launched:
Stage-Stage-: Map: Reduce: Cumulative CPU: 5.93 sec HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: seconds msec
OK
Time taken: 224.27 seconds, Fetched: row(s)
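The job log shows both a map and a reduce phase for this query. A rough Python model of how a global sort fits that shape follows: mappers emit each row keyed by id, the shuffle sorts by key, and a single reducer writes the rows out in order. This is an illustration of the idea rather than Hive's execution plan; the function names and the input splits are hypothetical.

```python
from itertools import chain


def map_phase(split):
    """Each mapper emits (sort key, row) pairs for its input split."""
    return [(row["id"], row) for row in split]


def order_by_id_desc(splits):
    """Model of ORDER BY id DESC: map, sort by key in the shuffle, one reducer emits rows."""
    emitted = chain.from_iterable(map_phase(split) for split in splits)
    shuffled = sorted(emitted, key=lambda pair: pair[0], reverse=True)
    # A single reducer sees every key, which is what makes the order global.
    return [row for _, row in shuffled]


splits = [[{"id": 1}], [{"id": 2}], [{"id": 3}]]  # one split per data file, illustrative
print([row["id"] for row in order_by_id_desc(splits)])
```

Funnelling everything through one reducer is what guarantees a total order, and it is also why ORDER BY can become slow on large tables.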
7. Hive also supports desc test; to show a table's columns.
hive> desc test;
OK
id int
Time taken: 6.194 seconds, Fetched: row(s)
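Working with Hive feels much like working with MySQL. Its drawback is that an ordinary table like this one cannot be modified row by row: UPDATE and DELETE only work on ACID transactional tables (Hive 0.14 and later). Its advantage is that users do not have to write MapReduce code themselves; plain SQL statements are enough to express complex relational queries.
Hive can do a lot more; I will write that up when I need it.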
[session] ①.session.save_handler = memcache session.save_handler 定义了来存储和获取与会话关联的数据的处理器的名字,默认是files ② ...