Hive: Three Ways to Use Hive (CLI, HWI, Thrift)

Hive can be used in three ways: the CLI command line, HWI (Hive Web Interface) through a browser, and a Thrift client connection.

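1. Hive command-line mode

Run the /hive/bin/hive executable directly, or enter hive --service cli
This mode is for command-line queries on Linux; the query syntax is largely the same as MySQL's.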
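The CLI can also be driven from a program. The following is a minimal sketch, assuming the hive -e option (which runs a query string without opening the interactive prompt) and the /hive/bin/hive path mentioned above; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class HiveCliRunner {

    // Builds the argument list for a non-interactive call: <hiveBin> -e <query>.
    static List<String> command(String hiveBin, String query) {
        return Arrays.asList(hiveBin, "-e", query);
    }

    // Runs the query through the CLI and returns the process exit code.
    static int run(String hiveBin, String query) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(hiveBin, query))
                .inheritIO() // show the CLI's output in this console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            // The path is the one used in the text; adjust it for your installation.
            int exit = run("/hive/bin/hive", "show tables");
            System.out.println("hive exited with code " + exit);
        } catch (IOException e) {
            System.out.println("could not start hive: " + e.getMessage());
        }
    }
}
```

Passing the query as a separate list element avoids shell quoting problems, because ProcessBuilder does not go through a shell.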

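2. Starting the Hive web interface

hive --service hwi lets you access Hive through a browser.

If the lib directory does not contain a hive-hwi-{version}.war package, we have to build it ourselves.

Download the source package from the official website (for example, version 1.1.0).

Extract it: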

$ tar zxvf apache-hive-1.1.0-src.tar.gz

Then enter the hwi directory and build the war file (note the dot at the end of the command):

#cd apache-hive-1.1.0-src/hwi

#jar cvfM0 hive-hwi-1.1.0.war -C web/ .

Once packaging finishes we have the war file we need; copy it to the $HIVE_HOME/lib directory:

#cp hive-hwi-1.1.0.war /usr/local/hive-1.1.0/lib

We also need to copy the JDK's tools.jar to the $HIVE_HOME/lib directory:

 cp /usr/local/jdk1.7.0_67/lib/tools.jar /usr/local/hive-1.1.0/lib

Otherwise an error is reported, because JAVA_HOME ends up pointing at $JAVA_HOME/jre, and the tools.jar under its lib directory is not the same as $JAVA_HOME/lib/tools.jar; compilation needs the latter.
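To make the path issue concrete, here is a small sketch that resolves the JDK-level tools.jar even when the Java home points at the bundled jre directory. The class name is hypothetical, and the layout assumed is that of JDK 8 and earlier (such as the jdk1.7.0_67 above); JDK 9 and later no longer ship a tools.jar.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ToolsJarLocator {

    // Returns <JDK root>/lib/tools.jar, stepping up one level
    // when javaHome points at the JDK's bundled "jre" directory.
    static Path toolsJar(String javaHome) {
        Path home = Paths.get(javaHome).normalize();
        Path last = home.getFileName();
        if (last != null && last.toString().equals("jre") && home.getParent() != null) {
            home = home.getParent();
        }
        return home.resolve("lib").resolve("tools.jar");
    }

    public static void main(String[] args) {
        // java.home is the running JVM's home; on older JDKs it is usually <JDK root>/jre.
        Path jar = toolsJar(System.getProperty("java.home"));
        System.out.println(jar + (Files.exists(jar) ? " exists" : " not found"));
    }
}
```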

Finally, edit hive-site.xml as follows:

<property>

    <name>hive.hwi.listen.host</name>

    <value>0.0.0.0</value>

    <description>Address to listen on</description>

  </property>

  <property>

    <name>hive.hwi.listen.port</name>

    <value>9999</value>

    <description>Port to listen on</description>

  </property>

  <property>

    <name>hive.hwi.war.file</name>

    <value>lib/hive-hwi-1.1.0.war</value>

    <description>Location of the war file; note that absolute paths are not supported here, which is a common pitfall</description>

  </property>
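As a quick check of these settings, the sketch below parses Hadoop-style property entries and derives the address to open in a browser. It assumes the properties sit inside a <configuration> root element, as in a complete hive-site.xml; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class HiveSiteReader {

    // Collects <property><name>..</name><value>..</value></property> pairs into a map.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = text(property, "name");
                String value = text(property, "value");
                if (name != null && value != null) {
                    props.put(name, value);
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable hive-site.xml", e);
        }
    }

    private static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    // Builds the browser address from the hwi listen host and port.
    static String hwiUrl(Map<String, String> props) {
        String host = props.getOrDefault("hive.hwi.listen.host", "0.0.0.0");
        String port = props.getOrDefault("hive.hwi.listen.port", "9999");
        // 0.0.0.0 means "all interfaces"; a browser on the same machine can use localhost.
        if (host.equals("0.0.0.0")) {
            host = "localhost";
        }
        return "http://" + host + ":" + port + "/hwi";
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>hive.hwi.listen.host</name><value>0.0.0.0</value></property>"
                + "<property><name>hive.hwi.listen.port</name><value>9999</value></property>"
                + "<property><name>hive.hwi.war.file</name><value>lib/hive-hwi-1.1.0.war</value></property>"
                + "</configuration>";
        Map<String, String> props = parse(xml);
        System.out.println("war file: " + props.get("hive.hwi.war.file"));
        System.out.println("open: " + hwiUrl(props));
    }
}
```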

Starting hwi
From the $HIVE_HOME directory, start hwi (because we previously switched the metastore from Derby to MySQL, make sure MySQL and Hadoop are already running before starting hwi):

nohup bin/hive --service hwi > /dev/null 2> /dev/null &

Web access

Open the web interface in a browser at localhost:9999/hwi; if the page loads, hwi started successfully.

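3. Connecting to HiveServer2 remotely over JDBC

So far, learning and practising Hive has relied on the CLI or hive -e. That approach only lets you run HiveQL queries, updates and similar operations, and it is clumsy and limited. Fortunately, Hive provides a thin-client implementation: through HiveServer or HiveServer2, a client can operate on data in Hive without starting the CLI. Both allow remote clients written in languages such as Java or Python to submit requests to Hive and fetch the results. Both HiveServer and HiveServer2 are built on Thrift, but only HiveServer is sometimes called the Thrift server; HiveServer2 is not.

Why is HiveServer2 needed when HiveServer already exists? HiveServer cannot handle concurrent requests from more than one client. The limitation comes from the Thrift interface that HiveServer uses and cannot be fixed by changing HiveServer's code. HiveServer was therefore rewritten in Hive 0.11.0 as HiveServer2, which solves the problem. HiveServer2 supports concurrency and authentication for multiple clients and provides better support for open API clients such as JDBC and ODBC.

Configuration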

<property>

  <name>hive.metastore.warehouse.dir</name>

  <value>/usr/hive/warehouse</value>               <!-- HDFS directory where Hive stores its databases and tables -->

  <description>location of default database for the warehouse</description>

</property>

<property>

  <name>hive.server2.thrift.port</name>

  <value>10000</value>                               <!-- Port for remote HiveServer2 connections; the default is 10000 -->

  <description>Port number of HiveServer2 Thrift interface.

  Can be overridden by setting $HIVE_SERVER2_THRIFT_PORT</description>

</property>

<property>

  <name>hive.server2.thrift.bind.host</name>

  <value>**.**.**.**</value>                          <!-- IP address of the cluster host where Hive runs -->

  <description>Bind host on which to run the HiveServer2 Thrift interface.

  Can be overridden by setting $HIVE_SERVER2_THRIFT_BIND_HOST</description>

</property>

<property>

  <name>hive.server2.long.polling.timeout</name>

  <value>5000</value>                                <!-- The default is 5000L; changed to 5000 here, otherwise the program reports an error -->

  <description>Time in milliseconds that HiveServer2 will wait, before responding to asynchronous calls that use long polling</description>

</property>

<property>

  <name>javax.jdo.option.ConnectionURL</name>

  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>  <!-- Hive metastore database; a local MySQL instance is used here -->

  <description>JDBC connect string for a JDBC metastore</description>

</property>

<property>

  <name>javax.jdo.option.ConnectionDriverName</name>          <!-- Driver class used to connect to the metastore database -->

  <value>com.mysql.jdbc.Driver</value>

  <description>Driver class name for a JDBC metastore</description>

</property>

<property>

  <name>javax.jdo.option.ConnectionUserName</name>             <!-- User name for the metastore database -->

  <value>hive</value>

  <description>username to use against metastore database</description>

</property>

<property>

  <name>javax.jdo.option.ConnectionPassword</name>             <!-- Password for the metastore database -->

  <value>hive</value>

  <description>password to use against metastore database</description>

</property>
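The bind host and port configured above are what a JDBC client later puts into its connection string. A minimal sketch of that mapping follows, assuming the jdbc:hive2:// scheme used for HiveServer2; the class name and the host name "master" are illustrative.

```java
final class HiveJdbcUrl {

    // Default HiveServer2 Thrift port, matching hive.server2.thrift.port above.
    static final int DEFAULT_PORT = 10000;

    // Builds jdbc:hive2://<host>:<port>/<database>, falling back to the "default" database.
    static String of(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host is required");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        String db = (database == null || database.isEmpty()) ? "default" : database;
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) {
        // "master" stands for the host set in hive.server2.thrift.bind.host.
        System.out.println(of("master", DEFAULT_PORT, "default"));
    }
}
```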

Start the metastore service first; on the command line, enter: hive --service metastore &

Then start the hiveserver2 service:
On the command line, enter: hive --service hiveserver2 &
Check the logs for errors.
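Besides reading the logs, a client can check whether the services accept TCP connections before it tries JDBC. The sketch below is one way to do that; the class name is hypothetical, port 10000 comes from the configuration above, and 9083 is assumed to be the metastore port (its usual default, not stated in this article). An open port only shows that something is listening, so the logs remain the real source of truth.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Retries the probe, pausing between attempts, until it succeeds or attempts run out.
    static boolean waitFor(String host, int port, int attempts, long pauseMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (isListening(host, port, 1000)) {
                return true;
            }
            Thread.sleep(pauseMillis);
        }
        return false;
    }

    public static void main(String[] args) {
        // 9083 is an assumed metastore port; 10000 is hive.server2.thrift.port from above.
        System.out.println("metastore listening:   " + isListening("localhost", 9083, 1000));
        System.out.println("hiveserver2 listening: " + isListening("localhost", 10000, 1000));
    }
}
```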

Example: operating Hive through the Java API

package com.berg.hive.test1.api;

import java.sql.Connection;

import java.sql.DriverManager;

import java.sql.ResultSet;

import java.sql.SQLException;

import java.sql.Statement;  

import org.apache.log4j.Logger;  

/**

 * Hive Java API example

 *

 * Start Hive's remote service from the command line: hive --service hiveserver2 &

 *

 * @author 汤高

 *

 */

public class HiveJdbcCli {

    // Many online examples use org.apache.hadoop.hive.jdbc.HiveDriver; that does not work with newer versions

    private static String driverName = "org.apache.hive.jdbc.HiveDriver";  

    // The scheme here is hive2; many online examples write hive, which fails on newer versions

    private static String url = "jdbc:hive2://master:10000/default";

    private static String user = "hive";

    private static String password = "hive";

    private static String sql = "";

    private static ResultSet res;

    private static final Logger log = Logger.getLogger(HiveJdbcCli.class);  

    public static void main(String[] args) {

        Connection conn = null;

        Statement stmt = null;

        try {

            conn = getConn();

            stmt = conn.createStatement();  

            // Step 1: drop the table if it already exists

            String tableName = dropTable(stmt);

            // Step 2: create the table

            createTable(stmt, tableName);

            // Step 3: show the created table

            showTables(stmt, tableName);

            // Run the describe table operation

            describeTables(stmt, tableName);

            // Run the load data into table operation

            loadData(stmt, tableName);

            // Run the select * query

            selectData(stmt, tableName);

            // Run a regular Hive query (a count)

            countData(stmt, tableName);  

        } catch (ClassNotFoundException e) {

            e.printStackTrace();

            log.error(driverName + " not found!", e);

            System.exit(1);

        } catch (SQLException e) {

            e.printStackTrace();

            log.error("Connection error!", e);

            System.exit(1);

        } finally {

            try {

                // Close the statement before the connection that created it

                if (stmt != null) {

                    stmt.close();

                }

                if (conn != null) {

                    conn.close();

                }

            } catch (SQLException e) {

                e.printStackTrace();

            }

        }

    }  

    private static void countData(Statement stmt, String tableName)

            throws SQLException {

        sql = "select count(1) from " + tableName;

        System.out.println("Running:" + sql);

        res = stmt.executeQuery(sql);

        System.out.println("Result of the regular Hive query:");

        while (res.next()) {

            System.out.println("count ------>" + res.getString(1));

        }

    }  

    private static void selectData(Statement stmt, String tableName)

            throws SQLException {

        sql = "select * from " + tableName;

        System.out.println("Running:" + sql);

        res = stmt.executeQuery(sql);

        System.out.println("Result of the select * query:");

        while (res.next()) {

            System.out.println(res.getInt(1) + "\t" + res.getString(2));

        }

    }  

    private static void loadData(Statement stmt, String tableName)

            throws SQLException {

        // File path; here the file sits in the home directory of the virtual machine where Hive is installed

        String filepath = "user.txt";

        sql = "load data local inpath '" + filepath + "' into table "

                + tableName;

        System.out.println("Running:" + sql);

        stmt.execute(sql);

    }  

    private static void describeTables(Statement stmt, String tableName)

            throws SQLException {

        sql = "describe " + tableName;

        System.out.println("Running:" + sql);

        res = stmt.executeQuery(sql);

        System.out.println("Result of describe table:");

        while (res.next()) {

            System.out.println(res.getString(1) + "\t" + res.getString(2));

        }

    }  

    private static void showTables(Statement stmt, String tableName)

            throws SQLException {

        sql = "show tables '" + tableName + "'";

        System.out.println("Running:" + sql);

        res = stmt.executeQuery(sql);

        System.out.println("Result of show tables:");

        if (res.next()) {

            System.out.println(res.getString(1));

        }

    }  

    private static void createTable(Statement stmt, String tableName)

            throws SQLException {

        sql = "create table "

                + tableName

                + " (key int, value string)  row format delimited fields terminated by '\t'";

        stmt.execute(sql);

    }  

    private static String dropTable(Statement stmt) throws SQLException {

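        // Name of the table to create

        String tableName = "testHive";

        // "if exists" keeps the statement from failing when the table is not there yet

        sql = "drop table if exists " + tableName;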

        stmt.execute(sql);

        return tableName;

    }  

    private static Connection getConn() throws ClassNotFoundException,

            SQLException {

        Class.forName(driverName);

        Connection conn = DriverManager.getConnection(url, user, password);

        return conn;

    }  

}
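The example above shares one static ResultSet and closes resources by hand. A more compact variant of the count step is sketched below with try-with-resources. It is a sketch under assumptions, not a replacement for the code above: the class and method names are hypothetical, the identifier check is an added precaution because the table name is concatenated into the SQL text, and it assumes the Hive JDBC driver jar is on the classpath and registers itself (otherwise keep the Class.forName call shown above).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.regex.Pattern;

final class HiveQuerySketch {

    // Accepts plain identifiers only, since the name is concatenated into the SQL text.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static String countSql(String table) {
        if (table == null || !IDENTIFIER.matcher(table).matches()) {
            throw new IllegalArgumentException("unsafe table name: " + table);
        }
        return "select count(1) from " + table;
    }

    // Opens a connection, runs the count, and closes everything in reverse order automatically.
    static long count(String url, String user, String password, String table) throws SQLException {
        String sql = countSql(table);
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            return rs.next() ? rs.getLong(1) : 0L;
        }
    }

    public static void main(String[] args) {
        try {
            // Same URL, user and table as in the example above.
            long rows = count("jdbc:hive2://master:10000/default", "hive", "hive", "testHive");
            System.out.println("rows: " + rows);
        } catch (SQLException e) {
            System.out.println("query failed: " + e.getMessage());
        }
    }
}
```

Resources declared in the try header are closed in reverse order, so the ResultSet and Statement are released before the Connection even when the query throws.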
