How to quickly and dynamically import HDFS data into a Hive table
1. The HDFS file
{"retCode":1,"retMsg":"Success","data":[{"secID":"000001.XSHE","ticker":"000001","secShortName":"深发展A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":24,"actPreClosePrice":24,"openPrice":24,"highestPrice":24.4,"lowestPrice":23.85,"closePrice":23.9,"turnoverVol":355700,"turnoverValue":8582250,"turnoverRate":0.0058,"accumAdjFactor":0.0117201563,"negMarketValue":1462295257.8,"marketValue":2145064267.7,"PB":2.2666,"isOpen":1},{"secID":"000002.XSHE","ticker":"000002","secShortName":"深万科A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":8,"actPreClosePrice":8,"openPrice":8,"highestPrice":8,"lowestPrice":7.7,"closePrice":7.9,"turnoverVol":375000,"turnoverValue":2944200,"turnoverRate":0.0066,"accumAdjFactor":0.0117337592,"negMarketValue":451011000,"marketValue":615927450,"PB":1.0001,"isOpen":1},{"secID":"000004.XSHE","ticker":"000004","secShortName":"深安达A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":7.25,"actPreClosePrice":7.25,"openPrice":7.25,"highestPrice":7.25,"lowestPrice":7.2,"closePrice":7.2,"turnoverVol":92000,"turnoverValue":665125,"turnoverRate":0.0078,"accumAdjFactor":0.2649084628,"negMarketValue":84977100,"marketValue":175500000,"PB":7.4199,"isOpen":1},{"secID":"000005.XSHE","ticker":"000005","secShortName":"深原野A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":6.46,"actPreClosePrice":6.46,"openPrice":6.49,"highestPrice":6.49,"lowestPrice":6.49,"closePrice":6.49,"turnoverVol":94500,"turnoverValue":613305,"turnoverRate":0.0021,"accumAdjFactor":0.1016459912,"negMarketValue":287756865,"marketValue":584100000,"PB":9.1783,"isOpen":1},{"secID":"000009.XSHE","ticker":"000009","secShortName":"深宝安A","exchangeCD":"XSHE","tradeDate":"1991-10-21","preClosePrice":5.75,"actPreClosePrice":5.75,"openPrice":5.7,"highestPrice":5.8,"lowestPrice":5.65,"closePrice":5.75,"turnoverVol":767500,"turnoverValue":4382245,"turnoverRate":0.0084,"accumAdjFactor":0.1026538759,"negMarketValue":524745000,"marketValue":1293922500,"PB":2.450
3,"isOpen":1},{"secID":"600601.XSHG","ticker":"600601","secShortName":"延中实业","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":65.7,"actPreClosePrice":65.7,"openPrice":66.4,"highestPrice":66.4,"lowestPrice":66.4,"closePrice":66.4,"turnoverVol":5333,"turnoverValue":354111,"dealAmount":81,"turnoverRate":0.0053,"accumAdjFactor":0.0010592167,"negMarketValue":66400000,"marketValue":66400000,"PB":40.7703,"isOpen":1},{"secID":"600602.XSHG","ticker":"600602","secShortName":"真空电子","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":640.6,"actPreClosePrice":640.6,"openPrice":647,"highestPrice":647,"lowestPrice":647,"closePrice":647,"turnoverVol":2589,"turnoverValue":1675083,"dealAmount":227,"turnoverRate":0.0051,"accumAdjFactor":0.0019640692,"negMarketValue":330552300,"marketValue":1294000000,"PB":287.6707,"isOpen":1},{"secID":"600651.XSHG","ticker":"600651","secShortName":"飞乐音响","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":119.6,"actPreClosePrice":119.6,"openPrice":120.8,"highestPrice":120.8,"lowestPrice":120.8,"closePrice":120.8,"turnoverVol":1102,"turnoverValue":133122,"dealAmount":14,"turnoverRate":0.0022,"accumAdjFactor":0.0008192464,"negMarketValue":60400000,"marketValue":60400000,"PB":39.6397,"isOpen":1},{"secID":"600652.XSHG","ticker":"600652","secShortName":"爱使电子","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":83.2,"actPreClosePrice":83.2,"openPrice":0,"highestPrice":0,"lowestPrice":0,"closePrice":83.2,"turnoverVol":0,"turnoverValue":0,"dealAmount":0,"turnoverRate":0,"accumAdjFactor":0.0006920481,"negMarketValue":22464000,"marketValue":22464000,"PB":33.8019,"isOpen":0},{"secID":"600653.XSHG","ticker":"600653","secShortName":"申华电工","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":103.4,"actPreClosePrice":103.4,"openPrice":104.4,"highestPrice":104.4,"lowestPrice":104.4,"closePrice":104.4,"turnoverVol":240,"turnoverValue":25056,"dealAmount":4,"turnoverRate":0.0005,"accumAdjFactor":0.0009289199,"negMarketValue
":52200000,"marketValue":52200000,"PB":97.279,"isOpen":1},{"secID":"600654.XSHG","ticker":"600654","secShortName":"飞乐股份","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":633.2,"actPreClosePrice":633.2,"openPrice":639.5,"highestPrice":639.5,"lowestPrice":639.5,"closePrice":639.5,"turnoverVol":101,"turnoverValue":64590,"dealAmount":26,"turnoverRate":0.0048,"accumAdjFactor":0.000663586,"negMarketValue":13429500,"marketValue":134358950,"PB":282.9834,"isOpen":1},{"secID":"600656.XSHG","ticker":"600656","secShortName":"浙江凤凰","exchangeCD":"XSHG","tradeDate":"1991-10-21","preClosePrice":1242.9,"actPreClosePrice":1242.9,"openPrice":1255.3,"highestPrice":1255.3,"lowestPrice":1255.3,"closePrice":1255.3,"turnoverVol":140,"turnoverValue":175742,"dealAmount":7,"turnoverRate":0.0031,"accumAdjFactor":0.0007136096,"negMarketValue":56502308.3,"marketValue":321798665.6,"PB":-604.4303,"isOpen":1}]}
2. Create a Hive staging table
CREATE EXTERNAL TABLE if not exists sensitop.equd_json_tmp (
retCode string,
retMsg string,
data array<struct<
secID: string,
tradeDate: date,
ticker: string,
secShortName: string,
exchangeCD: string,
preClosePrice: double,
actPreClosePrice: double,
openPrice: double,
highestPrice: double,
lowestPrice: double,
closePrice: double,
turnoverVol: double,
turnoverValue: double,
dealAmount: int,
turnoverRate: double,
accumAdjFactor: double,
negMarketValue: double,
marketValue: double,
PE: double,
PE1: double,
PB: double,
isOpen: int>>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 'hdfs://hdfs1.wdp:8020/sensitop/finance/equd';
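Before exploding the array, it can help to confirm that the SerDe actually parses the file. The queries below are a minimal sketch, assuming the staging table above was created successfully and that each JSON document occupies a single line of the file (with text input, Hive treats every line as one record). The alias n_records and the LIMIT value are only illustrative.

```sql
-- One row per JSON document: the status fields plus the number of array elements.
SELECT retCode, retMsg, size(data) AS n_records
FROM sensitop.equd_json_tmp;

-- Preview a few flattened records; each element of `data` becomes one row.
SELECT b.dt.secID, b.dt.tradeDate, b.dt.closePrice
FROM sensitop.equd_json_tmp
LATERAL VIEW explode(equd_json_tmp.data) b AS dt
LIMIT 5;
```

Fields that are missing from a record, such as PE, PE1, or dealAmount in some of the sample entries, should come back as NULL.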
3. Create the Hive tables
The records inside the array of the table above need to be written into this table one row at a time:
CREATE TABLE if not exists sensitop.equd_h(
secID string,
ticker string,
secShortName string,
exchangeCD string,
tradeDate date,
preClosePrice double,
actPreClosePrice double,
openPrice double,
highestPrice double,
lowestPrice double,
closePrice double,
turnoverVol double,
turnoverValue double,
dealAmount int,
turnoverRate double,
accumAdjFactor double,
negMarketValue double,
marketValue double,
PE double,
PE1 double,
PB double,
isOpen int)
partitioned by (year string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
Then create the final table:
CREATE TABLE if not exists sensitop.equd(
secID string,
ticker string,
secShortName string,
exchangeCD string,
tradeDate date,
preClosePrice double,
actPreClosePrice double,
openPrice double,
highestPrice double,
lowestPrice double,
closePrice double,
turnoverVol double,
turnoverValue double,
dealAmount int,
turnoverRate double,
accumAdjFactor double,
negMarketValue double,
marketValue double,
PE double,
PE1 double,
PB double,
isOpen int)
partitioned by (year string);
Note: the column order here must match that of the temporary table above.
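The ordering matters because Hive's INSERT ... SELECT assigns values to target columns by position, not by name. One way to check it, sketched here with standard Hive commands, is to print both definitions and compare them line by line:

```sql
-- Both outputs should list the same columns in the same order,
-- followed by the partition column `year`.
DESCRIBE sensitop.equd_h;
DESCRIBE sensitop.equd;
```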
4. Update the data by partition
insert overwrite table sensitop.equd_h
partition (year='2016')
select b.dt.secID,
b.dt.ticker,
b.dt.secShortName,
b.dt.exchangeCD,
b.dt.tradeDate,
b.dt.preClosePrice,
b.dt.actPreClosePrice,
b.dt.openPrice,
b.dt.highestPrice,
b.dt.lowestPrice,
b.dt.closePrice,
b.dt.turnoverVol,
b.dt.turnoverValue,
b.dt.dealAmount,
b.dt.turnoverRate,
b.dt.accumAdjFactor,
b.dt.negMarketValue,
b.dt.marketValue,
b.dt.PE,
b.dt.PE1,
b.dt.PB,
b.dt.isOpen
from sensitop.equd_json_tmp LATERAL VIEW explode(equd_json_tmp.data) b AS dt
where dt.tradedate >= '2016-01-01' and dt.tradedate <= '2016-12-31';
insert overwrite table sensitop.equd
partition (year='2016')
select secID,
ticker,
secShortName,
exchangeCD,
tradeDate,
preClosePrice,
actPreClosePrice,
openPrice,
highestPrice,
lowestPrice,
closePrice,
turnoverVol,
turnoverValue,
dealAmount,
turnoverRate,
accumAdjFactor,
negMarketValue,
marketValue,
PE,
PE1,
PB,
isOpen
from sensitop.equd_h dt
where year = '2016';
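After the two statements finish, a quick check can confirm that the partition exists and holds the expected date range. This is a sketch only; the aliases n_rows, first_day, and last_day are illustrative names.

```sql
-- List the partitions that now exist on the final table.
SHOW PARTITIONS sensitop.equd;

-- Row count and date range for the partition that was just overwritten.
SELECT count(*) AS n_rows,
       min(tradeDate) AS first_day,
       max(tradeDate) AS last_day
FROM sensitop.equd
WHERE year = '2016';
```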
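5. Use NiFi to insert the data dynamically
The flow has two branches. The left branch refreshes the current year's partition every day at 20:00; the right branch loads the data for 1990 through 2015 and only needs to run once.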
insert overwrite table sensitop.equd_h
partition (year='${year}')
select b.dt.secID,
b.dt.ticker,
b.dt.secShortName,
b.dt.exchangeCD,
b.dt.tradeDate,
b.dt.preClosePrice,
b.dt.actPreClosePrice,
b.dt.openPrice,
b.dt.highestPrice,
b.dt.lowestPrice,
b.dt.closePrice,
b.dt.turnoverVol,
b.dt.turnoverValue,
b.dt.dealAmount,
b.dt.turnoverRate,
b.dt.accumAdjFactor,
b.dt.negMarketValue,
b.dt.marketValue,
b.dt.PE,
b.dt.PE1,
b.dt.PB,
b.dt.isOpen
from sensitop.equd_json_tmp LATERAL VIEW explode(equd_json_tmp.data) b AS dt
where dt.tradedate >= '${year}-01-01' and dt.tradedate <= '${year}-12-31';
insert overwrite table sensitop.equd
partition (year='${year}')
select secID,
ticker,
secShortName,
exchangeCD,
tradeDate,
preClosePrice,
actPreClosePrice,
openPrice,
highestPrice,
lowestPrice,
closePrice,
turnoverVol,
turnoverValue,
dealAmount,
turnoverRate,
accumAdjFactor,
negMarketValue,
marketValue,
PE,
PE1,
PB,
isOpen
from sensitop.equd_h dt
where year = '${year}';
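For the one-off 1990 to 2015 load, an alternative to running the templated statement once per year is Hive's dynamic partitioning, which derives the partition value from the data itself. This is not the approach used in the flow above, only a sketch of a different technique. It assumes dynamic partitioning is allowed on the cluster and that the 26 resulting partitions stay within the configured limits; the alias part_year is illustrative, since the partition column is matched by position as the last item of the SELECT list.

```sql
-- Allow all partition values to be derived from the query result.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE sensitop.equd_h
PARTITION (year)
SELECT b.dt.secID,
       b.dt.ticker,
       b.dt.secShortName,
       b.dt.exchangeCD,
       b.dt.tradeDate,
       b.dt.preClosePrice,
       b.dt.actPreClosePrice,
       b.dt.openPrice,
       b.dt.highestPrice,
       b.dt.lowestPrice,
       b.dt.closePrice,
       b.dt.turnoverVol,
       b.dt.turnoverValue,
       b.dt.dealAmount,
       b.dt.turnoverRate,
       b.dt.accumAdjFactor,
       b.dt.negMarketValue,
       b.dt.marketValue,
       b.dt.PE,
       b.dt.PE1,
       b.dt.PB,
       b.dt.isOpen,
       -- Partition value, taken from the trade date; must come last.
       CAST(year(b.dt.tradeDate) AS STRING) AS part_year
FROM sensitop.equd_json_tmp
LATERAL VIEW explode(equd_json_tmp.data) b AS dt
WHERE b.dt.tradeDate >= '1990-01-01' AND b.dt.tradeDate <= '2015-12-31';
```

With INSERT OVERWRITE and dynamic partitions, only the partitions that appear in the query result are replaced, so the current year's partition maintained by the daily branch would be left alone.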
NiFi China community QQ group: 595034369