1. Log in to Hive with the hive command and run show databases; you can see that Hive ships with one built-in database, default.

[root@hadoop hive]# hive

Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
default    # Hive ships with a built-in database named default
Time taken: 21.043 seconds, Fetched: 1 row(s)
hive>

Next, log in to MySQL. show databases; lists the databases, and you can see a hive database. Run use hive; to enter it, show tables; to list its tables, and select * from DBS; to view the metadata of Hive's built-in default database.

[root@hadoop ~]# mysql -uroot -proot
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is
Server version: 5.6.-log MySQL Community Server (GPL)

Copyright (c) Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hive |
| mysql |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.32 sec)

mysql> use hive
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| KEY_CONSTRAINTS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
| WRITE_SET |
+---------------------------+
57 rows in set (0.00 sec)

mysql> select * from DBS; # metadata for Hive's built-in default database
+-------+-----------------------+----------------------------------------+---------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
|     1 | Default Hive database | hdfs://hadoop:9000/user/hive/warehouse | default | public     | ROLE       |
+-------+-----------------------+----------------------------------------+---------+------------+------------+
1 row in set (0.00 sec)

mysql>

2. Create a test database in Hive

hive> create database testhive; # create the database
OK
Time taken: 3.45 seconds

hive> show databases; # list databases
OK
default
testhive
Time taken: 1.123 seconds, Fetched: 2 row(s)

Checking in MySQL, the test database's metadata now shows up (including testhive's DB_ID, its storage location on HDFS, and so on).

mysql> select * from DBS;
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
|     1 | Default Hive database | hdfs://hadoop:9000/user/hive/warehouse             | default  | public     | ROLE       |
|     6 | NULL                  | hdfs://hadoop:9000/user/hive/warehouse/testhive.db | testhive | root       | USER       |
+-------+-----------------------+----------------------------------------------------+----------+------------+------------+
2 rows in set (0.00 sec)
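Incidentally, testhive's DESC column is NULL because no comment was given at creation time. A hedged sketch (testhive2 is a hypothetical name) of how a COMMENT clause would populate that column:

hive> create database testhive2 comment 'a test database';

After this, select * from DBS; in MySQL would show 'a test database' in the DESC column for testhive2.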

Looking at HDFS, we can see what testhive.db really is: just a directory. In other words, creating a database simply creates a directory.

The HDFS directory I created was /usr/hive/warehouse/, so why did the database get saved under /user/hive/warehouse/? Where did this go wrong? Or should I have created /user/hive/warehouse/ in the first place? (The latter: Hive's warehouse location comes from the hive.metastore.warehouse.dir property, which defaults to /user/hive/warehouse, so databases land there unless hive-site.xml overrides it.)
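You can confirm the active warehouse path from the Hive CLI itself: running set with just a property name prints its current value. Assuming the default configuration, the output would be:

hive> set hive.metastore.warehouse.dir;
hive.metastore.warehouse.dir=/user/hive/warehouse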

[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x - root supergroup -- : /user/hive/warehouse/testhive.db

3. Create a table

hive> use testhive; # switch to the database
OK
Time taken: 0.131 seconds

hive> create table test(id int); # create the table
OK
Time taken: 3.509 seconds

Looking at the table's metadata in MySQL, the test table belongs to the database whose DB_ID is 6, i.e. testhive (verify with select * from DBS;).

mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
|        |             |     6 |                  | root  |           |       | test     | MANAGED_TABLE | NULL               | NULL               |                    |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
1 row in set (0.01 sec)
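To see at a glance which database each table belongs to, the two metastore tables can be joined directly in MySQL; a sketch against the standard metastore schema listed above:

mysql> select d.NAME as db_name, t.TBL_NAME, t.TBL_TYPE
    -> from TBLS t join DBS d on t.DB_ID = d.DB_ID;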

Looking at HDFS, a new directory was created for the new table

[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db
Found 1 items
drwxr-xr-x - root supergroup -- : /user/hive/warehouse/testhive.db/test

4. Insert data.

4.1 Insert a row into the table with insert into test values (1); and you can see Hive running a MapReduce job over the data.

hive> insert into test values (1);
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20180727155527_5971c7d8-9b5c-4ef3-98f7-63febe38c79a
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1532671010251_0001, Tracking URL = http://hadoop:8088/proxy/application_1532671010251_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1532671010251_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.32 sec
MapReduce Total cumulative CPU time: 3 seconds 320 msec
Ended Job = job_1532671010251_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://hadoop:9000/user/hive/warehouse/testhive.db/test/.hive-staging_hive_2018-07-27_15-55-27_353_3121708441542170724-1/-ext-10000
Loading data to table testhive.test
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 3.32 sec   HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 320 msec
OK
Time taken: 453.982 seconds

Checking HDFS, the inserted data was written out as a single file, 000000_0

[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0
1
4.2 Insert another row with insert into test values (2); and again Hive runs a MapReduce job.

hive> insert into test values (2);

Checking HDFS, the new row was written out as a second file, 000000_0_copy_1

[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found 2 items
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_1
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0_copy_1
2

4.3 Insert one more row with insert into test values (3); and once more Hive runs a MapReduce job.

Checking HDFS, this row was written out as yet another file, 000000_0_copy_2

[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found 3 items
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_1
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_2
[root@hadoop ~]# hdfs dfs -cat /user/hive/warehouse/testhive.db/test/000000_0_copy_2
3
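Each single-row INSERT launches its own MapReduce job and leaves behind its own small file, which is wasteful. A hedged alternative (multi-row VALUES has been supported since Hive 0.14) is to batch the rows into one statement, giving one job and one output file:

hive> insert into test values (2), (3);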

4.4 View the table in Hive

hive> select * from test;
OK
1
2
3
Time taken: 5.483 seconds, Fetched: 3 row(s)

5. Load data from a local file

First create the file

[root@hadoop ~]# vi hive.txt  # create the file

# save and exit
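The file's contents were lost from this transcript; for a table with a single int column it would just be one integer per line. A hypothetical example:

[root@hadoop ~]# cat hive.txt
4
5
6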

Then load the data

hive> load data local inpath '/root/hive.txt' into table testhive.test; # load the data
Loading data to table testhive.test
OK
Time taken: 6.282 seconds

Checking in Hive, the file's contents were mapped into the table's column

hive> select * from test;
OK
Time taken: 0.534 seconds, Fetched: row(s)

Checking HDFS, hive.txt was saved into the test table's directory

[root@hadoop ~]# hdfs dfs -ls /user/hive/warehouse/testhive.db/test
Found 4 items
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_1
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/000000_0_copy_2
-rwxr-xr-x root supergroup -- : /user/hive/warehouse/testhive.db/test/hive.txt
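Hive can map a file onto columns because the table's row format defines how to split each line; with a single int column the whole line is the value. For a multi-column table you declare the field delimiter at creation time. A hedged sketch (test2 is a hypothetical table):

hive> create table test2 (id int, name string)
    > row format delimited
    > fields terminated by ','
    > stored as textfile;

A comma-separated file loaded into test2 would then map each line's two fields onto id and name.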

6. Hive also supports sorting: select * from test order by id desc; and you can see that this, too, goes through a MapReduce job

hive> select * from test order by id desc;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = root_20180730093619_c798eb69-b94f--94cc-5ec56865ed5c
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1532913019648_0001, Tracking URL = http://hadoop:8088/proxy/application_1532913019648_0001/
Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1532913019648_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
Stage-1 map = 0%, reduce = 0%
Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.66 sec
Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.72 sec
Stage-1 map = 100%, reduce = 100%, Cumulative CPU 5.41 sec
MapReduce Total cumulative CPU time: 5 seconds 930 msec
Ended Job = job_1532913019648_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 5.93 sec   HDFS Read: HDFS Write: SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 930 msec
OK
Time taken: 224.27 seconds, Fetched: row(s)
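ORDER BY produces a total ordering, which is why it compiles down to a single reducer and can be slow on large tables. When a per-reducer ordering is enough, SORT BY spreads the work across reducers; a hedged sketch:

hive> select * from test sort by id desc;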

7. Hive also supports desc test;

hive> desc test;
OK
id int
Time taken: 6.194 seconds, Fetched: 1 row(s)

Working with Hive feels much like working with MySQL. Its drawback is the lack of update and delete commands (row-level UPDATE and DELETE only work on transactional ACID tables, introduced in Hive 0.14); its advantage is that users never have to write MapReduce themselves, since simple SQL statements can express complex processing.
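For completeness, a hedged sketch of a table that does accept UPDATE and DELETE: it must be bucketed, stored as ORC, and flagged transactional, and the transaction manager must be enabled in hive-site.xml (details vary by version; test_acid is a hypothetical name):

hive> create table test_acid (id int)
    > clustered by (id) into 2 buckets
    > stored as orc
    > tblproperties ('transactional'='true');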

There are many more Hive operations; I'll write them up as I come to use them.
