Writing in-memory data to a Hive table with Spark

hive-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->

<configuration>
    <!-- Hive metastore service, used by Spark SQL -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://master:9083</value>
        <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>
    <!-- MySQL connection URL and database name (metastore); the parameter after the ?
         creates the database automatically if it does not exist -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <!-- JDBC driver class used to connect to MySQL -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <!-- MySQL user name and password -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>username to use against metastore database</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
        <description>password to use against metastore database</description>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
        <description>Whether to print the names of the columns in query output.</description>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
        <description>Whether to include the current database in the Hive prompt.</description>
    </property>
</configuration>
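
Spark only picks up this file if it can find it, for example in Spark's conf directory or on the application classpath. As a rough sketch, and not something the original post does, the metastore URI can also be passed directly on the SparkSession builder and checked by listing the databases. The object name MetastoreCheck is made up for illustration, and setting hive.metastore.uris through .config(...) is an assumption worth verifying against your Spark version.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, not part of the original post.
object MetastoreCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("MetastoreCheck")
      // Same value as hive.metastore.uris in hive-site.xml above
      // (assumption: the builder setting is honored by your Spark version).
      .config("hive.metastore.uris", "thrift://master:9083")
      .enableHiveSupport()
      .getOrCreate()

    // Lists the databases known to the metastore. The example in this post
    // writes to hadoop10, so that database is expected to show up here.
    spark.sql("show databases").show()

    spark.stop()
  }
}
```

If the list contains only default, Spark is most likely using a local embedded metastore instead of the remote one at thrift://master:9083.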

The example code follows.

package spark_sql

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructField, StructType}
// Only needed for the random-data variant further down; the helper is not shown in this post.
// import test.ProductData

/**
  * @Program: spark01
  * @Author: 努力就是魅力
  * @Since: 2018-10-19 08:30
  *         Description:
  *
  *         Write in-memory data to a Hive table with Spark. This is a complete, runnable example.
  *
  *    Result of querying the Hive table:
  *         hive (hadoop10)> select * from data_block;
  *         OK
  *         data_block.ip	data_block.time	data_block.phonenum
  *         40.234.66.122	2018-10-12 09:35:21
  *         5.150.203.160	2018-10-03 14:41:09	13389202989
  *
  **/
case class Datablock(ip: String, time: String, phoneNum: String)

object WriteTabletoHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("WriteTableToHive")
      .config("spark.sql.warehouse.dir", "D:\\reference-data\\spark01\\spark-warehouse")
      .enableHiveSupport()
      .getOrCreate()

    // Needed for the .toDS() conversion below.
    import spark.implicits._

    // Explicit schema with three nullable string columns. It is not used below,
    // because the Datablock case class already gives the Dataset its schema.
    val schemaString = "ip time phoneNum"
    val fields = schemaString.split(" ")
      .map(fieldName => StructField(fieldName, StringType, nullable = true))
    val schema = StructType(fields)

    // Alternative: build a random row with the ProductData helper (requires the import above).
    // val datablockDS = Seq(Datablock(ProductData.getRandomIp, ProductData.getRecentAMonthRandomTime("yyyy-MM-dd HH:mm:ss"), ProductData.getRandomPhoneNumber)).toDS()

    // In-memory data: a single fixed row converted to a Dataset[Datablock].
    val datablockDS = Seq(Datablock("192.168.40.122", "2018-01-01 12:25:25", "18866556699")).toDS()

    datablockDS.show()

    // Register a temporary view, then append its rows to the Hive table hadoop10.data_block.
    datablockDS.toDF().createOrReplaceTempView("dataBlock")
    spark.sql("select * from dataBlock")
      .write.mode("append")
      .saveAsTable("hadoop10.data_block")
  }
}
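
The commented-out variant in the code relies on test.ProductData, which is not included in this post. The following is only a guess at what such a helper might look like: the method names come from the call sites above, while the bodies (value ranges, the 30-day window, the "13" phone prefix) are assumptions made for illustration.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Random

// Hypothetical stand-in for test.ProductData; the original helper is not shown.
object ProductData {
  private val rnd = new Random()

  // Random dotted-quad IPv4 address: first octet 1..223, remaining octets 0..255.
  def getRandomIp: String =
    (Seq(1 + rnd.nextInt(223)) ++ Seq.fill(3)(rnd.nextInt(256))).mkString(".")

  // Random timestamp within the last 30 days, formatted with the given pattern.
  def getRecentAMonthRandomTime(pattern: String): String = {
    val secondsBack = rnd.nextInt(30 * 24 * 60 * 60).toLong
    LocalDateTime.now()
      .minusSeconds(secondsBack)
      .format(DateTimeFormatter.ofPattern(pattern))
  }

  // Random 11-digit number starting with "13", matching the shape of the sample output.
  def getRandomPhoneNumber: String =
    "13" + Seq.fill(9)(rnd.nextInt(10)).mkString
}
```

Placed in a package named test, an object like this would let the ProductData-based definition of datablockDS compile, so each run appends a different random row instead of the same fixed one.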
