分批插入数据基于mybatis

DB框架：Mybatis。DataBase:Oracle。

----------------------------------------------------------------------------

批量插入数据方式：

一、Mybatis 全局设置批处理；

二、Mybatis 局部设置批处理；

三、Mybatis foreach批量插入：

①SELECT UNION ALL；

②BEGIN INSERT INTO ...;INSERT INTO...;...;END;

四、java自带的批处理插入；

五、其他方式

-----------------------------------------------------------------------------

先说结论：Mybatis(全局/局部)批处理和java自带的批处理性能上差不多，属于最优处理办法，我这边各种测试后，最后采用Mybatis局部批处理方式。

一、Mybatis 全局设置批处理

先上Spring-Mybatis.xml 配置信息

<?xml version="1.0" encoding="UTF-8"?>

<beans xmlns="http://www.springframework.org/schema/beans"

       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

       xmlns:context="http://www.springframework.org/schema/context" xmlns:tx="http://www.springframework.org/schema/tx"

       xsi:schemaLocation="http://www.springframework.org/schema/beans

                           http://www.springframework.org/schema/beans/spring-beans.xsd

                           http://www.springframework.org/schema/context

                           http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd">

    <!-- 自动扫描(自动注入) -->

    <context:annotation-config/>

    <context:component-scan base-package="com.company.dao"/>

    <!-- 动态数据源 -->

    <bean id="dataSource" class="com.company.dao.datasource.DataSource">

        <property name="myConfigFile" value="mySource.xml"/>

    </bean>

    <!-- mybatis配置 -->

    <bean id="sqlSessionFactory" class="org.mybatis.spring.SqlSessionFactoryBean">

        <property name="dataSource" ref="dataSource"/>

        <property name="mapperLocations" value="classpath*:mapper/*/*/*.xml"/>

        <property name="configLocation" value="classpath:/mybatisConfig.xml"/>

    </bean>

    <!-- 自动创建映射器，不用单独为每个 mapper映射-->

    <bean class="org.mybatis.spring.mapper.MapperScannerConfigurer">

        <property name="basePackage" value="com.company.dao.mapper"/>

        <property name="sqlSessionFactoryBeanName" value="sqlSessionFactory"/>

    </bean>

    <!-- 事务管理器配置,单数据源事务 -->

    <bean id="transactionManager"

          class="org.springframework.jdbc.datasource.DataSourceTransactionManager">

        <property name="dataSource" ref="dataSource"/>

    </bean>

    <tx:annotation-driven transaction-manager="transactionManager"/>

</beans>

Spring-Mybatis.xml

再上mybatisConfig.xml（在本项目中，我没有设置setting。最终采用的局部批处理，因此未设置全局批处理，具体原因后面再说。）

<?mapper.xml version="1.0" encoding="UTF-8" ?>

<!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd">

<configuration>

    <settings>

        <!-- 配置默认的执行器。SIMPLE 就是普通的执行器；REUSE 执行器会重用预处理语句（prepared statements）； BATCH 执行器将重用语句并执行批量更新。-->

        <setting name="defaultExecutorType" value="BATCH"/>

        <!--详见：http://www.mybatis.org/mybatis-3/zh/configuration.html-->

    </settings>

    <!-- 别名列表 -->

    <typeAliases>

       <!-- typeAliases 中的配置都是配置别名，在此就不贴出来了 -->

    </typeAliases>

</configuration>

mybatisConfig.xml

这样子设置好后，在BaseService开放saveBatch(List<T> list)方法

@Override

    public void save(List<T> list) {

        for (int i = 0;i < list.size();i++){

            mapper.insert(list.get(i));

        }

    }

    @Override

    public void saveBatch(List<T> list) {

        int size = list.size();

        int unitNum = 500;

        int startIndex = 0;

        int endIndex = 0;

        while (size > 0){

            if(size > unitNum){

                endIndex = startIndex+unitNum;

            }else {

                endIndex = startIndex+size;

            }

            List<T> insertData = list.subList(startIndex,endIndex);

            save(insertData);

            size = size - unitNum;

            startIndex = endIndex;

        }

    }

BaseService.saveBatch(List list)

虽然看上去是500条记录，一次次INSERT INTO，但由于在全局已经设置Mybatis是批处理执行器，所以这500条INSERT INTO只会与Oracle数据库通信一次。

全局设置批处理的局限性在哪里呢？

先附上mybatis官方的讨论列表中最很关键的一句：“If the BATCH executor is in use, the update counts are being lost. ”

设置全局批处理后，DB里的insert、Update和delete方法，都无法返回进行DML影响DB_TABLE的行数。

1.insert 无法返回影响的行数，这个好解决，一个批处理放在一个事务里，记录批处理失败次数，总数-批处理失败次数*单位批处理数据量，就能得到insert 影响DB_TABLE的行数；

2.但是update和delete就无法很简单的去统计影响行数了，如果做反复查询，反而降低了效率，得不偿失。

虽现在的项目尚未有需要反馈影响DB_TABLE行数的需求，但是为了更灵活，我们放弃了全局批处理的方式。

!这里提个疑问：为什么Mybatis官方，不将批处理的选择方式下沉到方法级别？方便开发者根据实际情况，灵活选择。我觉得这是个可以改进的地方，如有机会，可看源码去进行改进。

---------------------------------------------------------------------------------------------------------

二、Mybatis局部批处理方式

由于领导说全局批处理方式，不够灵活，不适宜项目所需，要另想办法支持。但是java自带的批处理，因为项目代码管理的要求，也不能采用。因此，在仔细阅读官方文档后，设想自己能否获取SQLSession后openSession，将这个会话设置为批处理呢？

先看MyBatis官方网站(须FanQiang)：http://www.mybatis.org/mybatis-3/zh/getting-started.html

SqlSession session = sqlSessionFactory.openSession();

try {

  BlogMapper mapper = session.getMapper(BlogMapper.class);

  // do work

} finally {

  session.close();

}

官方建议的写法

后查阅Mybatis java API(须FanQiang): http://www.mybatis.org/mybatis-3/zh/java-api.html

现在你有一个 SqlSessionFactory,可以用来创建 SqlSession 实例。

SqlSessionFactory

SqlSessionFactory 有六个方法可以用来创建 SqlSession 实例。通常来说,如何决定是你选择下面这些方法时:

Transaction (事务): 你想为 session 使用事务或者使用自动提交(通常意味着很多数据库和/或 JDBC 驱动没有事务)?
Connection (连接): 你想 MyBatis 获得来自配置的数据源的连接还是提供你自己
Execution (执行): 你想 MyBatis 复用预处理语句和/或批量更新语句(包括插入和删除)

重载的 openSession()方法签名设置允许你选择这些可选中的任何一个组合。

SqlSession openSession()

SqlSession openSession(boolean autoCommit)

SqlSession openSession(Connection connection)

SqlSession openSession(TransactionIsolationLevel level)

SqlSession openSession(ExecutorType execType,TransactionIsolationLevel level)

SqlSession openSession(ExecutorType execType)

SqlSession openSession(ExecutorType execType, boolean autoCommit)

SqlSession openSession(ExecutorType execType, Connection connection)

Configuration getConfiguration();

官方提供的openSession方法

因此出来了局部批处理第一套代码实现方式：

public static void sqlSession(List<Student> data) throws IOException {

        String resource = "mybatis-dataSource.xml";

        InputStream inputStream = null;

        SqlSession batchSqlSession = null;

        try{

            inputStream = Resources.getResourceAsStream(resource);

            SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(inputStream);

            batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);

            int batchCount = 500;//每批commit的个数

            for(int index = 0; index < data.size();index++){

                Student stu = data.get(index);

                batchSqlSession.getMapper(Student.class).insert(stu);

                if(index !=0 && index%batchCount == 0){

                    batchSqlSession.commit();

                }

            }

            batchSqlSession.commit();

        }catch (Exception e){

            e.printStackTrace();

        }finally {

            if(batchSqlSession != null){

                batchSqlSession.close();

            }

            if(inputStream != null){

                inputStream.close();

            }

        }

    }

sqlSession(List data)

<?xml version="1.0" encoding="UTF-8" ?>

<!DOCTYPE configuration

  PUBLIC "-//mybatis.org//DTD Config 3.0//EN"

  "http://mybatis.org/dtd/mybatis-3-config.dtd">

<configuration>

  <environments default="development">

    <environment id="development">

      <transactionManager type="JDBC"/>

      <dataSource type="POOLED">

        <property name="driver" value="${driver}"/>

        <property name="url" value="${url}"/>

        <property name="username" value="${username}"/>

        <property name="password" value="${password}"/>

      </dataSource>

    </environment>

  </environments>

  <mappers>

    <mapper resource="org/mybatis/example/Student.xml"/>

  </mappers>

</configuration>

mybatis-dataSource.xml

已经在Spring-Mybatis.xml 中配置了SQLSessionFactory，那我为何还要自己去创建SQLSessionFactory呢？因此继续改良代码

public static void mybatisSqlSession(List<Student> data){

        DefaultSqlSessionFactory sqlSessionFactory = (DefaultSqlSessionFactory) ServiceBeanConstant.CTX.getBean("sqlSessionFactory");

        SqlSession batchSqlSession = null;

        try{

            batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);

            int batchCount = 500;//每批commit的个数

            for(int index = 0; index < data.size();index++){

                Student stu = data.get(index);

                batchSqlSession.getMapper(StudentMapper.class).insert(stu);

                if(index !=0 && index%batchCount == 0){

                    batchSqlSession.commit();

                }

            }

            batchSqlSession.commit();

        }catch (Exception e){

            e.printStackTrace();

        }finally {

            if(batchSqlSession != null){

                batchSqlSession.close();

            }

        }

    }

mybatisSqlSession(List data)

这个版本的局部批处理插入是比较满意的，最终采用的方式也是这个版本。

下面放出在IService接口定义和Service的具体实现代码：

IService接口定义

/**

     * 批处理插入数据(方法内部定义500条为一个批次进行提交)

     * 使用注意事项：必须在XxxMappper.xml中实现<insert id="insert" ...>....<insert/>的sql

     * @param data 批量插入的数据

     * @param mClass 调用的XxxMaperr.class

     * @auth robin

     * Created on 2016/3/14

     */

    void saveBatch(List<T> data,Class mClass);

saveBatch(List data,Class mClass)

Service实现

@Override

    public void saveBatch(List<T> data,Class mClass) {

        DefaultSqlSessionFactory sqlSessionFactory = (DefaultSqlSessionFactory) ServiceBeanConstant.CTX.getBean("sqlSessionFactory");

        SqlSession batchSqlSession = null;

        try{

            batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);

            int batchCount = 500;//每批commit的个数

            for(int index = 0; index < data.size();index++){

                T t = data.get(index);

                ((BaseMapper)batchSqlSession.getMapper(mClass)).insert(t);

                if(index !=0 && index%batchCount == 0){

                    batchSqlSession.commit();

                }

            }

            batchSqlSession.commit();

        }catch (Exception e){

            e.printStackTrace();

        }finally {

            if(batchSqlSession != null){

                batchSqlSession.close();

            }

        }

    }

saveBatch(List data,Class mClass)

局部和全局批处理插入对比：局部批处理，可以对特定一类的方法，进行数据批处理，不会影响其他DML语句，其他DML语句，可以正常返回影响DB_TABLE的行数。

!这样既能针对特殊需求(批处理)支持，也能支持未来需要返回影响数据行的要求。

注意：使用批处理方式进行DML操作，是无法反馈影响DB_TABLE行数的数据。无论是局部批处理还是java自带的批处理方式，皆无法反馈DB_TABLE count。

补充完善：

在我的Service实现中，通过注入的方式，获取mapper的实例

public class BaseService<MAPPER extends BaseMapper, T, PK extends Serializable> implements IBaseService<T, PK> {

    protected T tt;

    /**

     * 实体操作的自动注入Mapper(随初始化一同注入，必须用set方法)

     */

    protected MAPPER mapper;

    public MAPPER getMapper() {

        return mapper;

    }

    @Autowired

    public void setMapper(MAPPER mapper) {

        this.mapper = mapper;

    }

   //后续代码略

}

Service

前面的Service saveBatch方法中，还需要传入指定的Mapper.class.对本项目其他开发者来说，与之前的环境相比，多传一个参数感觉别扭。

那么为何我不继续封装，外部无需传入Mapper.class，而是通过内部注入的mapper实例获取Mapper.class.
改良后的代码：

@Override

    public T saveBatch(List<T> data) {

        T back = null;

        DefaultSqlSessionFactory sqlSessionFactory = (DefaultSqlSessionFactory) ServiceBeanConstant.CTX.getBean("sqlSessionFactory");

        SqlSession batchSqlSession = null;

        try{

            batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);

            int batchCount = 500;//每批commit的个数

            for(int index = 0; index < data.size();index++){

                T t = data.get(index);

                back = t;

                Class<?>[] interfaces=mapper.getClass().getInterfaces();

                Class clazz = null;

                for (int i=0;i<interfaces.length;i++){

                    if(BaseMapper.class.isAssignableFrom(interfaces[i])){

                        clazz = interfaces[i];

                    }

                }

                if(clazz == null){

                    throw new Exception("user-defined exception:mapper not implements interfaces com.company.dao.mapper.BaseMapper");

                }

                BaseMapper baseMapper = (BaseMapper) batchSqlSession.getMapper(clazz);

                baseMapper.insert(t);

                if(index !=0 && index%batchCount == 0){

                    batchSqlSession.commit();

                }

            }

            batchSqlSession.commit();

        }catch (Exception e){

            e.printStackTrace();

        }finally {

            if(batchSqlSession != null){

                batchSqlSession.close();

            }

            return back;

        }

    }

saveBatch(List data)

这里对mapper实例进行一个简短的说明：

1.mapper实例是通过java动态代理来实例化的；

2.mapper的SQLSession是使用mybatis统一的配置实例的；

3.mapper的默认执行器是SIMPLE（普通的执行器）;

-------------------------------------------------------------------------------------

三、Mybatis foreach批量插入

Mybatis foreach 批量插入，如果批量插入的数据量大，不得不说这真是一个非常糟糕的做法。

无论是SELECT ** UNION ALL 还是BEGIN ...;END; ，相对而言后者比前者稍微好点。

放出DB和我测试的结果：

耗时	占当时整个数据库CPU百分比	说明
15.5	98.33	union all方式拼接插入
16.4	97.75	begin end方式插入块
1.54	64.81	java 自带的batch方式插入

①foreach union all的批量插入，现已有大量的博客资源可供参考，我就不贴出自己的实现方式了。

如果有兴趣可以参阅：http://blog.csdn.net/sanyuesan0000/article/details/19998727

这篇博客。BEGIN END的方式，也是从这篇博客中得到启发。只不过他是把BEGIN END用在update中。

②foreach begin end 语句块

我的实现：

<insert id="insertBatch" parameterType="java.util.List">

           BEGIN

           <foreach collection="list" item="item" index="index" separator=";" >

               INSERT INTO TABLE.STUDENT (ID,AGE,NAME,STU_ID) VALUES

               ( DEMO.SEQ_EID.NEXTVAL,#{item.age},#{item.name},#{item.stuId} )

           </foreach>

           ;END ;

       </insert>

insertBatch

调用方式：

@Override

    public void saveBatch(List<T> list) {

        int size = list.size();

        int unitNum = 500;

        int startIndex = 0;

        int endIndex = 0;

        while (size > 0){

            if(size > unitNum){

                endIndex = startIndex+unitNum;

            }else {

                endIndex = startIndex+size;

            }

            List<T> insertData = list.subList(startIndex,endIndex);

            mapper.insertBatch(insertData);

            size = size - unitNum;

            startIndex = endIndex;

        }

saveBatch(List list)

四、java自带的批处理方式

废话不多说，直接上代码

package DB;

import base.Student;

import java.sql.Connection;

import java.sql.DriverManager;

import java.sql.PreparedStatement;

import java.sql.SQLException;

import java.util.ArrayList;

import java.util.List;

/**

 * Created by robin on 2016/5/23.

 *

 * @author robin

 */

public class InsertTableDemo {

    public static void main(String args[]) throws SQLException {

        Connection connection = null;

        List<Student> dataList = getDataList(100000);

        long startTime = 0;

        try{

            connection = getConn();

            startTime=System.currentTimeMillis();

            connection.setAutoCommit(false);

            PreparedStatement statement = connection.prepareStatement("INSERT INTO STUDENT (ID,AGE,NAME,STU_ID) VALUES ( DEMO.SEQ_EID.NEXTVAL, ?,?,? ) ");

            int num = 0;

            for (int i = 0;i< dataList.size();i++){

                Student s = dataList.get(i);

                statement.setInt(1, s.getAge());

                statement.setString(2, s.getName());

                statement.setString(3, s.getStuId());

                statement.addBatch();

                num++;

                if(num !=0 && num%500 == 0){

                    statement.executeBatch();

                    connection.commit();

                    num = 0;

                }

            }

            statement.executeBatch();

            connection.commit();

        }catch (Exception e){

            e.printStackTrace();

            connection.rollback();

        }finally {

            if(connection != null){

                connection.close();

            }

            long endTime=System.currentTimeMillis();

            System.out.println("方法执行时间："+(endTime-startTime)+"ms");

        }

    }

    public static Connection getConn(){

        String driver = "oracle.jdbc.driver.OracleDriver";

        String url = "jdbc:oracle:thin:@//ip:port/DMEO"; //DMEO为数据库名

        String user = "user";

        String password = "pwd";

        try{

            Class.forName(driver);

            Connection conn = DriverManager.getConnection(url, user, password);

            return conn;

        } catch (ClassNotFoundException e) {

            e.printStackTrace();

        } catch (SQLException e) {

            e.printStackTrace();

        }

        return null;

    }

    public static List<Student> getDataList(int f){

        List<Student> data = new ArrayList<>();

        for (int i =0;i<f;i++){

            Student s = new Student(""+i,"小明" + i,i);

            data.add(s);

        }

        return data;

    }

}

JDBC BATCH

这种批量插入大量数据的方式，性能上最好。但是因为我们小组代码管理所限制，因此这种方式不使用。

五、其他方式

现在已经忘了，其他方式到底使用过哪些，但总归是比以上四种效果都更差，所以没什么印象了。

分批插入数据基于mybatis的更多相关文章

批量插入数据(基于Mybatis的实现-Oracle)
前言:做一个数据同步项目,要求:同步数据不丢失的情况下,提高插入性能. 项目DB框架:Mybatis.DataBase:Oracle. -------------------------------- ...
MyBatis中批量插入数据对插入记录数的限制
<基于Mybatis框架的批量数据插入的性能问题的探讨>(作者:魏静敏刘欢杰来源:<计算机光盘软件与应用> 2013 年第 19 期)中提到批量插入的记录数不能超过1000 ...
在MyBatis中查询数据、涉及多参数的数据访问操作、插入数据时获取数据自增长的id、关联表查询操作、动态SQL、关于配置MyBatis映射没有代码提示的解决方案
1. 单元测试在单元测试中,每个测试方法都需要执行相同的前置代码和后置代码,则可以自定义2个方法,分别在这2个方法中执行前置代码和后置代码,并为这2个方法添加@Before和@After注解,然后, ...
MyBatis 批量插入数据对插入记录数的限制
<基于 MyBatis 框架的批量数据插入的性能问题的探讨>(作者:魏静敏刘欢杰来源:<计算机光盘软件与应用> 2013 年第 19 期)中提到批量插入的记录数不能超过10 ...
MyBatis 批量插入数据的 3 种方法！
批量插入功能是我们日常工作中比较常见的业务功能之一,之前我也写过一篇关于<MyBatis Plus 批量数据插入功能,yyds!>的文章,但评论区的反馈不是很好,主要有两个问题:第一,对 ...
Mybatis + Mysql 插入数据时中文乱码问题
近日跟朋友一起建立一个项目,用的是spring+mybatis+mysql. 今天碰到一个mybatis向mysql中插入数据时,中文显示为'???'的问题,拿出来说下. 对于数据库操作中出现的中文乱 ...
mybatis 插入数据时返回主键
在使用MyBatis做持久层时,insert语句默认是不返回记录的主键值,而是返回插入的记录条数:显然,假如主键是你生成后插入的,自然你已经有主键了,显然不需要我们再去获得,所以我们这里处理的是当主键 ...
mybatis批量插入数据到oracle
mybatis 批量插入数据到oracle报 ”java.sql.SQLException: ORA-00933: SQL 命令未正确结束“ 错误解决方法 oracle批量插入使用 insert a ...
MyBatis框架——mybatis插入数据返回主键（mysql、oracle）
向数据库中插入数据时,大多数情况都会使用自增列或者UUID做为主键.主键的值都是插入之前无法知道的,但很多情况下我们在插入数据后需要使用刚刚插入数据的主键,比如向两张关联表A.B中插入数据(A的主键是 ...

随机推荐

windows切换窗口和网页快捷键
alt+tab 切换窗口win+D 显示桌面,再按一下返回运来的网页win+M 所有程序最小化网页之间切换(我用的是360) ctrl + tab 往回切 ctrl + shift +tab
在vCenter上创建新用户（适用版本6.0）
LeetCode（力扣）——Search in Rotated Sorted Array2 搜索旋转排序数组 python实现
题目描述: python实现 Search in Rotated Sorted Array2 搜索旋转排序数组中文: 假设按照升序排序的数组在预先未知的某个点上进行了旋转. ( 例如,数组 [0 ...
HTML5 表单提交实例
html <!DOCTYPE html> <html> <head> <title>表单</title> <meta charset= ...
Flutter中的日期、格式化日期、日期选择器组件
Flutter中的日期和时间戳 //獲取當前日期 DateTime _nowDate = DateTime.now(); print(_nowDate);//2019-10-29 10:57:20.3 ...
hive之压缩
对数据进行压缩可以节约磁盘空间,提高系统吞吐量和性能,但是压缩和解压缩会增加CPU的开销. 1.hive的压缩编/解码器 BZip2和GZip压缩率高,但是需要消耗较多的CPU开销.LZO和Snapp ...
进程管理工具uptime,top,htop
进程管理工具uptime,top,htop 一uptime 显示当前时间,系统已启动的时间.当前上线人数,系统平均负载(1.5.10分钟的平均负载,一般不会超过1) 系统平均负载:指在特定时间间隔内运 ...
flask01
-python中的web框架 -a:socket服务端 b:路由转发 c:模板渲染 -Django:a:用了别人的 b,c自己写的 -Flask:a:用了别人的 b自己写的,c:用了别人的: jinj ...
MySQL索引优化之双表示例
select * from tableA a left join tableB b on a.f_id = b.id; 索引建tableB表上面, 因为left join 注定左表全都有,所以应该关心 ...
iOS设计模式之工厂模式
一,什么是工厂模式模式定义: “专门定义一个类来负责创建其他类的实例,被创建的实例通常具有共同的父类.” 世界上就是由一个工厂类,根据传入的参数,动态地决定创建出哪一个产品类的实例. 需求场景: 简 ...

分批插入数据基于mybatis

SqlSessionFactory

分批插入数据基于mybatis的更多相关文章

随机推荐

热门专题