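KingbaseES Examples: Importing Data into a Table with Many Indexes
Overview
How can you quickly insert a large volume of data, say tens or hundreds of millions of rows, into a table that carries many indexes? This article compares three approaches.
Data preparation
Prepare a table with twenty indexes.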
kingbase=# \d+ bigtab
Table "kingbase.bigtab"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+----------+--------------+-------------
id | integer | | | | plain | |
c01 | integer | | | | plain | |
c02 | integer | | | | plain | |
c03 | integer | | | | plain | |
c04 | integer | | | | plain | |
c05 | integer | | | | plain | |
c06 | integer | | | | plain | |
c07 | integer | | | | plain | |
c08 | integer | | | | plain | |
c09 | integer | | | | plain | |
c10 | integer | | | | plain | |
c11 | integer | | | | plain | |
c12 | integer | | | | plain | |
c13 | integer | | | | plain | |
c14 | integer | | | | plain | |
c15 | integer | | | | plain | |
c16 | integer | | | | plain | |
c17 | integer | | | | plain | |
c18 | integer | | | | plain | |
c19 | integer | | | | plain | |
c20 | integer | | | | plain | |
c21 | integer | | | | plain | |
c22 | integer | | | | plain | |
c23 | integer | | | | plain | |
c24 | integer | | | | plain | |
c25 | integer | | | | plain | |
c26 | integer | | | | plain | |
c27 | integer | | | | plain | |
c28 | integer | | | | plain | |
c29 | integer | | | | plain | |
t01 | text | | | | extended | |
t02 | text | | | | extended | |
t03 | text | | | | extended | |
t04 | text | | | | extended | |
t05 | text | | | | extended | |
t06 | text | | | | extended | |
t07 | text | | | | extended | |
t08 | text | | | | extended | |
t09 | text | | | | extended | |
t10 | text | | | | extended | |
t11 | text | | | | extended | |
t12 | text | | | | extended | |
t13 | text | | | | extended | |
t14 | text | | | | extended | |
t15 | text | | | | extended | |
t16 | text | | | | extended | |
t17 | text | | | | extended | |
t18 | text | | | | extended | |
t19 | text | | | | extended | |
t20 | text | | | | extended | |
Indexes:
"bigtab_i01" btree (c01)
"bigtab_i02" btree (c02)
"bigtab_i03" btree (c03)
"bigtab_i04" btree (c04)
"bigtab_i05" btree (c05)
"bigtab_i06" btree (c06)
"bigtab_i07" btree (c07)
"bigtab_i08" btree (c08)
"bigtab_i09" btree (c09)
"bigtab_i10" btree (c10)
"bigtab_i11" btree (c11)
"bigtab_i12" btree (c12)
"bigtab_i13" btree (c13)
"bigtab_i14" btree (c14)
"bigtab_i15" btree (c15)
"bigtab_i16" btree (c16)
"bigtab_i17" btree (c17)
"bigtab_i18" btree (c18)
"bigtab_i19" btree (c19)
"bigtab_i20" btree (c20)
Access method: heap
kingbase=#
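The original post shows only the `\d+` output, not the DDL that produced it. The following is a minimal sketch of how equivalent DDL could be generated, assuming the table has exactly the columns and single-column btree indexes listed above. The function name `bigtab_ddl` and its parameters are hypothetical and not part of the original.

```python
# Sketch only: builds DDL for a table shaped like bigtab.
# The original DDL is not shown in the article, so this is an assumption
# reconstructed from the \d+ output (id, c01..c29, t01..t20, 20 indexes).

def bigtab_ddl(table="bigtab", int_cols=29, text_cols=20, indexed_cols=20):
    """Return a list of SQL statements that create the table and its indexes."""
    columns = ["id integer"]
    columns += [f"c{i:02d} integer" for i in range(1, int_cols + 1)]
    columns += [f"t{i:02d} text" for i in range(1, text_cols + 1)]

    statements = [
        f"create table {table} (\n    " + ",\n    ".join(columns) + "\n);"
    ]
    # One single-column index per column c01..c20, named bigtab_i01..bigtab_i20.
    statements += [
        f"create index {table}_i{i:02d} on {table} (c{i:02d});"
        for i in range(1, indexed_cols + 1)
    ]
    return statements


if __name__ == "__main__":
    print("\n".join(bigtab_ddl()))
```

Running the script prints one `create table` statement followed by twenty `create index` statements that can be pasted into a client session.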
Method 1: Insert the data directly and let the indexes be maintained automatically
kingbase=#
kingbase=# insert into bigtab
kingbase-# select id
kingbase-# , (random() * 100)::int + 1000 c01
kingbase-# , (random() * 200)::int + 1000 c02
kingbase-# , (random() * 300)::int + 10000 c03
kingbase-# , (random() * 400)::int + 10000 c04
kingbase-# , (random() * 500)::int + 10000 c05
kingbase-# , (random() * 600)::int + 10000 c06
kingbase-# , (random() * 700)::int + 10000 c07
kingbase-# , (random() * 800)::int + 10000 c08
kingbase-# , (random() * 900)::int + 10000 c09
kingbase-# , (random() * 1000)::int + 10000 c10
kingbase-# , (random() * 2000)::int + 10000 c11
kingbase-# , (random() * 3000)::int + 10000 c12
kingbase-# , (random() * 4000)::int + 10000 c13
kingbase-# , (random() * 5000)::int + 10000 c14
kingbase-# , (random() * 6000)::int + 10000 c15
kingbase-# , (random() * 7000)::int + 10000 c16
kingbase-# , (random() * 8000)::int + 10000 c17
kingbase-# , (random() * 9000)::int + 10000 c18
kingbase-# , (random() * 10000)::int + 10000 c19
kingbase-# , (random() * 20000)::int + 10000 c20
kingbase-# , (random() * 30000)::int + 10000 c21
kingbase-# , (random() * 40000)::int + 10000 c22
kingbase-# , (random() * 50000)::int + 10000 c23
kingbase-# , (random() * 60000)::int + 10000 c24
kingbase-# , (random() * 70000)::int + 10000 c25
kingbase-# , (random() * 80000)::int + 10000 c26
kingbase-# , (random() * 90000)::int + 10000 c27
kingbase-# , (random() * 10000)::int + 10000 c28
kingbase-# , (random() * 10000)::int + 10000 c29
kingbase-# , md5(random()::text) t01
kingbase-# , md5(random()::text) t02
kingbase-# , md5(random()::text) t03
kingbase-# , md5(random()::text) t04
kingbase-# , md5(random()::text) t05
kingbase-# , md5(random()::text) t06
kingbase-# , md5(random()::text) t07
kingbase-# , md5(random()::text) t08
kingbase-# , md5(random()::text) t09
kingbase-# , md5(random()::text) t10
kingbase-# , md5(random()::text) t11
kingbase-# , md5(random()::text) t12
kingbase-# , md5(random()::text) t13
kingbase-# , md5(random()::text) t14
kingbase-# , md5(random()::text) t15
kingbase-# , md5(random()::text) t16
kingbase-# , md5(random()::text) t17
kingbase-# , md5(random()::text) t18
kingbase-# , md5(random()::text) t19
kingbase-# , md5(random()::text) t20
kingbase-# from generate_series(1, 2000000) id;
INSERT 0 2000000
Time: 299331.143 ms (04:59.331)
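Pros: a single statement; indexes are maintained automatically; indexes added later are covered automatically.
Cons: the indexes are maintained row by row, so the load takes a long time.
Method 2: Drop the indexes, insert the data, then recreate the indexes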
kingbase=#
kingbase=# do
kingbase-# $$
kingbase$# begin
kingbase$# drop index bigtab_i01;
kingbase$# drop index bigtab_i02;
kingbase$# drop index bigtab_i03;
kingbase$# drop index bigtab_i04;
kingbase$# drop index bigtab_i05;
kingbase$# drop index bigtab_i06;
kingbase$# drop index bigtab_i07;
kingbase$# drop index bigtab_i08;
kingbase$# drop index bigtab_i09;
kingbase$# drop index bigtab_i10;
kingbase$# drop index bigtab_i11;
kingbase$# drop index bigtab_i12;
kingbase$# drop index bigtab_i13;
kingbase$# drop index bigtab_i14;
kingbase$# drop index bigtab_i15;
kingbase$# drop index bigtab_i16;
kingbase$# drop index bigtab_i17;
kingbase$# drop index bigtab_i18;
kingbase$# drop index bigtab_i19;
kingbase$# drop index bigtab_i20;
kingbase$#
kingbase$# insert into bigtab
kingbase$# select id
kingbase$# , (random() * 100)::int + 1000 c01
kingbase$# , (random() * 200)::int + 1000 c02
kingbase$# , (random() * 300)::int + 10000 c03
kingbase$# , (random() * 400)::int + 10000 c04
kingbase$# , (random() * 500)::int + 10000 c05
kingbase$# , (random() * 600)::int + 10000 c06
kingbase$# , (random() * 700)::int + 10000 c07
kingbase$# , (random() * 800)::int + 10000 c08
kingbase$# , (random() * 900)::int + 10000 c09
kingbase$# , (random() * 1000)::int + 10000 c10
kingbase$# , (random() * 2000)::int + 10000 c11
kingbase$# , (random() * 3000)::int + 10000 c12
kingbase$# , (random() * 4000)::int + 10000 c13
kingbase$# , (random() * 5000)::int + 10000 c14
kingbase$# , (random() * 6000)::int + 10000 c15
kingbase$# , (random() * 7000)::int + 10000 c16
kingbase$# , (random() * 8000)::int + 10000 c17
kingbase$# , (random() * 9000)::int + 10000 c18
kingbase$# , (random() * 10000)::int + 10000 c19
kingbase$# , (random() * 20000)::int + 10000 c20
kingbase$# , (random() * 30000)::int + 10000 c21
kingbase$# , (random() * 40000)::int + 10000 c22
kingbase$# , (random() * 50000)::int + 10000 c23
kingbase$# , (random() * 60000)::int + 10000 c24
kingbase$# , (random() * 70000)::int + 10000 c25
kingbase$# , (random() * 80000)::int + 10000 c26
kingbase$# , (random() * 90000)::int + 10000 c27
kingbase$# , (random() * 10000)::int + 10000 c28
kingbase$# , (random() * 10000)::int + 10000 c29
kingbase$# , md5(random()::text) t01
kingbase$# , md5(random()::text) t02
kingbase$# , md5(random()::text) t03
kingbase$# , md5(random()::text) t04
kingbase$# , md5(random()::text) t05
kingbase$# , md5(random()::text) t06
kingbase$# , md5(random()::text) t07
kingbase$# , md5(random()::text) t08
kingbase$# , md5(random()::text) t09
kingbase$# , md5(random()::text) t10
kingbase$# , md5(random()::text) t11
kingbase$# , md5(random()::text) t12
kingbase$# , md5(random()::text) t13
kingbase$# , md5(random()::text) t14
kingbase$# , md5(random()::text) t15
kingbase$# , md5(random()::text) t16
kingbase$# , md5(random()::text) t17
kingbase$# , md5(random()::text) t18
kingbase$# , md5(random()::text) t19
kingbase$# , md5(random()::text) t20
kingbase$# from generate_series(1, 2000000) id;
kingbase$#
kingbase$# create index bigtab_i01 on bigtab (c01);
kingbase$# create index bigtab_i02 on bigtab (c02);
kingbase$# create index bigtab_i03 on bigtab (c03);
kingbase$# create index bigtab_i04 on bigtab (c04);
kingbase$# create index bigtab_i05 on bigtab (c05);
kingbase$# create index bigtab_i06 on bigtab (c06);
kingbase$# create index bigtab_i07 on bigtab (c07);
kingbase$# create index bigtab_i08 on bigtab (c08);
kingbase$# create index bigtab_i09 on bigtab (c09);
kingbase$# create index bigtab_i10 on bigtab (c10);
kingbase$# create index bigtab_i11 on bigtab (c11);
kingbase$# create index bigtab_i12 on bigtab (c12);
kingbase$# create index bigtab_i13 on bigtab (c13);
kingbase$# create index bigtab_i14 on bigtab (c14);
kingbase$# create index bigtab_i15 on bigtab (c15);
kingbase$# create index bigtab_i16 on bigtab (c16);
kingbase$# create index bigtab_i17 on bigtab (c17);
kingbase$# create index bigtab_i18 on bigtab (c18);
kingbase$# create index bigtab_i19 on bigtab (c19);
kingbase$# create index bigtab_i20 on bigtab (c20);
kingbase$#
kingbase$# end;
kingbase$# $$;
ANONYMOUS BLOCK
Time: 83069.170 ms (01:23.069)
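Pros: the indexes are built in bulk; shortest elapsed time.
Cons: the statements are complex and hard-coded; the drop and create index statements must be maintained by hand; indexes added later are not covered.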
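One way to soften the hard-coding drawback is to generate the script from the index definitions instead of typing them by hand. The sketch below assumes the definitions were fetched beforehand, for example with `select indexname, indexdef from pg_indexes where tablename = 'bigtab'`. `pg_indexes` is a standard PostgreSQL view, and its availability in a given KingbaseES version is an assumption. The helper name `bulk_load_script` and the sample row are illustrative, not from the original.

```python
# Sketch only: assembles a "drop indexes, load, recreate indexes" script
# from (indexname, indexdef) pairs. Index names are assumed to be simple
# identifiers that need no quoting.

def bulk_load_script(indexes, load_sql):
    """indexes: list of (indexname, indexdef) pairs; load_sql: the insert statement."""
    drops = [f"drop index {name};" for name, _ in indexes]
    # indexdef is expected to be a complete CREATE INDEX statement.
    creates = [indexdef.rstrip().rstrip(";") + ";" for _, indexdef in indexes]
    load = load_sql.rstrip().rstrip(";") + ";"
    return "\n".join(drops + [load] + creates)


if __name__ == "__main__":
    # Illustrative row in the shape pg_indexes would return.
    sample = [
        ("bigtab_i01", "CREATE INDEX bigtab_i01 ON kingbase.bigtab USING btree (c01)"),
    ]
    print(bulk_load_script(sample, "insert into bigtab select ..."))
```

Because the script is rebuilt from the catalog each time, an index added later is picked up on the next run, provided the definitions are read before anything is dropped.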
Method 3: Disable index updates, insert the data, then rebuild all indexes on the table
kingbase=# do
kingbase-# $$
kingbase$# begin
kingbase$#
kingbase$# update pg_index
kingbase$# set indislive= false
kingbase$# where indrelid = 'bigtab'::regclass;
kingbase$#
kingbase$# insert into bigtab
kingbase$# select id
kingbase$# , (random() * 100)::int + 1000 c01
kingbase$# , (random() * 200)::int + 1000 c02
kingbase$# , (random() * 300)::int + 10000 c03
kingbase$# , (random() * 400)::int + 10000 c04
kingbase$# , (random() * 500)::int + 10000 c05
kingbase$# , (random() * 600)::int + 10000 c06
kingbase$# , (random() * 700)::int + 10000 c07
kingbase$# , (random() * 800)::int + 10000 c08
kingbase$# , (random() * 900)::int + 10000 c09
kingbase$# , (random() * 1000)::int + 10000 c10
kingbase$# , (random() * 2000)::int + 10000 c11
kingbase$# , (random() * 3000)::int + 10000 c12
kingbase$# , (random() * 4000)::int + 10000 c13
kingbase$# , (random() * 5000)::int + 10000 c14
kingbase$# , (random() * 6000)::int + 10000 c15
kingbase$# , (random() * 7000)::int + 10000 c16
kingbase$# , (random() * 8000)::int + 10000 c17
kingbase$# , (random() * 9000)::int + 10000 c18
kingbase$# , (random() * 10000)::int + 10000 c19
kingbase$# , (random() * 20000)::int + 10000 c20
kingbase$# , (random() * 30000)::int + 10000 c21
kingbase$# , (random() * 40000)::int + 10000 c22
kingbase$# , (random() * 50000)::int + 10000 c23
kingbase$# , (random() * 60000)::int + 10000 c24
kingbase$# , (random() * 70000)::int + 10000 c25
kingbase$# , (random() * 80000)::int + 10000 c26
kingbase$# , (random() * 90000)::int + 10000 c27
kingbase$# , (random() * 10000)::int + 10000 c28
kingbase$# , (random() * 10000)::int + 10000 c29
kingbase$# , md5(random()::text) t01
kingbase$# , md5(random()::text) t02
kingbase$# , md5(random()::text) t03
kingbase$# , md5(random()::text) t04
kingbase$# , md5(random()::text) t05
kingbase$# , md5(random()::text) t06
kingbase$# , md5(random()::text) t07
kingbase$# , md5(random()::text) t08
kingbase$# , md5(random()::text) t09
kingbase$# , md5(random()::text) t10
kingbase$# , md5(random()::text) t11
kingbase$# , md5(random()::text) t12
kingbase$# , md5(random()::text) t13
kingbase$# , md5(random()::text) t14
kingbase$# , md5(random()::text) t15
kingbase$# , md5(random()::text) t16
kingbase$# , md5(random()::text) t17
kingbase$# , md5(random()::text) t18
kingbase$# , md5(random()::text) t19
kingbase$# , md5(random()::text) t20
kingbase$# from generate_series(1, 2000000) id;
kingbase$#
kingbase$# update pg_index
kingbase$# set indislive= true
kingbase$# where indrelid = 'bigtab'::regclass;
kingbase$#
kingbase$# analyse bigtab;
kingbase$# reindex table bigtab;
kingbase$#
kingbase$# end;
kingbase$# $$;
ANONYMOUS BLOCK
Time: 87110.126 ms (01:27.110)
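Pros: the indexes are built in bulk, so the load is fast; the statements follow a fixed pattern; indexes are maintained automatically; indexes added later are covered.
Cons: several SQL statements are involved, which makes the approach awkward to embed in a statement block.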
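The three timings reported above can be put side by side with a little arithmetic. This sketch uses only the elapsed times from the transcripts (2,000,000 rows each). The dictionary keys and the function name `speedups` are illustrative.

```python
# Elapsed times in milliseconds, copied from the psql transcripts above.
TIMINGS_MS = {
    "method 1 (direct insert)": 299331.143,
    "method 2 (drop and recreate)": 83069.170,
    "method 3 (disable and reindex)": 87110.126,
}


def speedups(timings, baseline):
    """Return how many times faster each entry is than the baseline entry."""
    base = timings[baseline]
    return {name: base / elapsed for name, elapsed in timings.items()}


if __name__ == "__main__":
    for name, factor in speedups(TIMINGS_MS, "method 1 (direct insert)").items():
        print(f"{name}: {factor:.2f}x")
```

On these figures, methods 2 and 3 come out roughly 3.6 and 3.4 times faster than the direct insert, and the gap between the two of them is about four seconds.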
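Final notes
reindex table relies on statistics, so analyse must be run on the table first; only then can all updatable indexes on the table be rebuilt successfully.
reindex index is not affected by this; it can forcibly rebuild an index that is not being updated, and it sets indislive = true automatically.
If an exception occurs during REINDEX, every index that needed rebuilding is left in the invalid state: the indexes still occupy space and their definitions remain, but they cannot be used.
To avoid exceptions during REINDEX, skip unique indexes, indexes that foreign keys depend on, and similar indexes when toggling index updates.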
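The original does not show how to skip those indexes. The sketch below is one possible way to build the `update pg_index` statement of method 3 with extra filters. It assumes PostgreSQL-compatible catalogs, where `pg_index.indisunique` marks unique indexes and `pg_constraint.conindid` names the index that backs a constraint. Whether this matches a given KingbaseES version is an assumption, and the helper name `toggle_indexes_sql` is hypothetical.

```python
# Sketch only: builds the catalog update used in method 3, optionally
# leaving unique indexes and foreign-key-backing indexes untouched.
# Assumes PostgreSQL-compatible pg_index and pg_constraint catalogs.

def toggle_indexes_sql(table, live, skip_unique=True, skip_fk_backing=True):
    """Return an UPDATE statement that sets indislive for the table's indexes."""
    flag = "true" if live else "false"
    conditions = [f"indrelid = '{table}'::regclass"]
    if skip_unique:
        # Primary key indexes are unique indexes, so they are skipped too.
        conditions.append("not indisunique")
    if skip_fk_backing:
        conditions.append(
            "indexrelid not in "
            "(select conindid from pg_constraint where contype = 'f')"
        )
    return (
        "update pg_index\n"
        f"   set indislive = {flag}\n"
        " where " + "\n   and ".join(conditions) + ";"
    )


if __name__ == "__main__":
    print(toggle_indexes_sql("bigtab", live=False))
    print(toggle_indexes_sql("bigtab", live=True))
```

The same filters should be used for both the disabling and the re-enabling statement, so that exactly the indexes that were switched off are switched back on.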