How to improve Java's I/O performance

Original article: http://www.javaworld.com/article/2077523/build-ci-sdlc/java-tip-26--how-to-improve-java-s-i-o-performance.html

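The java.io package in JDK 1.0.2 exposed many I/O performance problems. This article presents a way to work around them, plus an extra tip on turning off synchronization.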

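Java's I/O performance used to be a bottleneck for many Java applications. The main cause was the poor design and implementation of the java.io package in JDK 1.0.2. The key issue is buffering: most classes in java.io are not buffered. In fact, only BufferedInputStream and BufferedOutputStream are buffered, and the methods they offer are limited. For example, most file-processing applications need to parse a file line by line, but the only class that provides a readLine method is DataInputStream, which has no internal buffer. Its readLine method reads the input stream one character at a time until it reaches "\n" or "\r\n". Every character read involves a file I/O operation, which is extremely inefficient for a large file: without buffering, a 5-megabyte file requires at least 5 million single-character file I/O operations.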

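JDK 1.1 improved I/O performance by adding a set of Reader and Writer classes. When reading a large file, BufferedReader's readLine method is at least 10 to 20 times faster than the earlier DataInputStream.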
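As a minimal sketch of the buffered approach, the snippet below counts the lines of a text source with BufferedReader.readLine. The class name LineCount and the helper countLines are illustrative names chosen here, not part of the original article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

class LineCount {
    // Counts lines using BufferedReader, which pulls data from the
    // underlying source in chunks and serves readLine() from memory.
    static int countLines(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        int count = 0;
        while (in.readLine() != null) {
            count++;
        }
        return count;
    }
}
```

To count the lines of a file, pass a FileReader for that file as the source.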

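Unfortunately, JDK 1.1 does not solve every performance problem. For example, to parse a large file without reading all of it into memory, you need the RandomAccessFile class, but in JDK 1.1 it is still unbuffered, and no comparable Reader class is provided.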

How to tackle the I/O problem

To tackle the problem of inefficient file I/O, we need a buffered RandomAccessFile class. The new class is derived from RandomAccessFile so that it can reuse all of its methods. It is named Braf (short for BufferedRandomAccessFile).

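    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class Braf extends RandomAccessFile {
        // fields and methods are added step by step below
    }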

For efficiency reasons, we define a byte buffer instead of a char buffer. The variables buf_end, buf_pos, and real_pos record the effective positions in the buffer: buf_end is the number of valid bytes in the buffer, buf_pos is the index of the next byte to read, and real_pos is the position of the underlying file pointer.

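    byte buffer[];
    int BUF_SIZE;        // buffer capacity, set in the constructor
    int buf_end = 0;     // number of valid bytes in buffer
    int buf_pos = 0;     // index of the next byte to read
    long real_pos = 0;   // position of the underlying file pointer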

A new constructor is added with an additional parameter that specifies the size of the buffer:

    public Braf(String filename, String mode, int bufsize)
            throws IOException {
        super(filename, mode);
        invalidate();
        BUF_SIZE = bufsize;
        buffer = new byte[BUF_SIZE];
    }

The new read method always reads from the buffer first. It overrides the native read method of the original class, which is not invoked until the buffer is exhausted. At that point fillBuffer is called, and it invokes the original read to refill the buffer. The private method invalidate marks the buffer as no longer containing valid contents. This is necessary when the seek method moves the file pointer outside the buffer.

    public final int read() throws IOException {
        if (buf_pos >= buf_end) {
            if (fillBuffer() < 0)
                return -1;
        }
        if (buf_end == 0) {
            return -1;
        } else {
            // mask so that bytes 0x80..0xFF come back as 128..255, not as negatives
            return buffer[buf_pos++] & 0xff;
        }
    }

    private int fillBuffer() throws IOException {
        int n = super.read(buffer, 0, BUF_SIZE);
        if (n >= 0) {
            real_pos += n;
            buf_end = n;
            buf_pos = 0;
        }
        return n;
    }

    private void invalidate() throws IOException {
        buf_end = 0;
        buf_pos = 0;
        real_pos = super.getFilePointer();
    }
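One detail matters in any read() that returns values taken from a byte array: Java's byte type is signed, so an element returned without masking turns 0xFF into -1, which a caller cannot tell apart from end of file. The sketch below shows the difference; ByteMask and its two methods are hypothetical names used only for illustration.

```java
class ByteMask {
    // Widening a byte to int sign-extends it: 0xFF becomes -1.
    static int unmasked(byte b) {
        return b;
    }

    // Masking with 0xff keeps the value in 0..255, as the read() contract expects.
    static int masked(byte b) {
        return b & 0xff;
    }
}
```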

The other, parameterized read method is overridden as well; the code is listed below. If the buffer holds enough data, it simply calls System.arraycopy to copy a portion of the buffer directly into the user-provided array. This yields the most significant performance gain, because read is heavily used in the getNextLine method, which is our replacement for readLine.

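    public int read(byte b[], int off, int len) throws IOException {
        int leftover = buf_end - buf_pos;
        if (len <= leftover) {
            // fast path: the request is served entirely from the buffer
            System.arraycopy(buffer, buf_pos, b, off, len);
            buf_pos += len;
            return len;
        }
        // slow path: read byte by byte, refilling the buffer as needed
        for (int i = 0; i < len; i++) {
            int c = this.read();
            if (c != -1)
                b[off + i] = (byte) c;
            else {
                // end of file: report -1 if nothing was read, else the count
                return (i == 0) ? -1 : i;
            }
        }
        return len;
    }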

The original getFilePointer and seek methods need to be overridden as well in order to take advantage of the buffer. Most of the time, both methods simply operate inside the buffer.

    public long getFilePointer() throws IOException {
        long l = real_pos;
        return (l - buf_end + buf_pos);
    }

    public void seek(long pos) throws IOException {
        // keep the distance as a long: casting it to int first could wrap on large files
        long n = real_pos - pos;
        if (n >= 0 && n <= buf_end) {
            buf_pos = (int) (buf_end - n);
        } else {
            super.seek(pos);
            invalidate();
        }
    }
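The position bookkeeping can be checked with a small worked example. Suppose the underlying file pointer (real_pos) is at 8192 after a fill that loaded 4096 bytes (buf_end), and 100 of them have been consumed (buf_pos). The logical position is then 8192 - 4096 + 100 = 4196, and any seek target between 4096 and 8192 can be served from the buffer. The sketch below uses the hypothetical names BufferMath, logicalPosition, and inBuffer, and does the range check in long arithmetic.

```java
class BufferMath {
    // Logical position seen by the caller: start of the buffered region
    // plus the number of bytes already consumed from it.
    static long logicalPosition(long realPos, int bufEnd, int bufPos) {
        return realPos - bufEnd + bufPos;
    }

    // A seek target can be served from the buffer when it lies within
    // [realPos - bufEnd, realPos].
    static boolean inBuffer(long realPos, int bufEnd, long target) {
        long n = realPos - target;
        return n >= 0 && n <= bufEnd;
    }
}
```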

Most important, a new method, getNextLine, is added to replace readLine. We cannot simply override readLine because it is declared final in the original class. getNextLine first checks whether the buffer still contains unread data. If it does not, the buffer is refilled. If the newline delimiter is found in the buffer, the line is read from the buffer and converted into a String. Otherwise, the method falls back to calling read byte by byte. Although the code of this latter portion is similar to the original readLine, it performs better because read is buffered in the new class.

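    /**
     * Returns the next line as a String, or null at end of file.
     */
    public final String getNextLine() throws IOException {
        String str = null;
        if (buf_end - buf_pos <= 0) {
            if (fillBuffer() < 0) {
                // end of file is not an error: return null, as readLine does
                return null;
            }
        }
        // look for a line terminator in the unread part of the buffer
        int lineend = -1;
        for (int i = buf_pos; i < buf_end; i++) {
            if (buffer[i] == '\n') {
                lineend = i;
                break;
            }
        }
        if (lineend < 0) {
            // no terminator buffered: fall back to byte-by-byte (still buffered) reads
            StringBuffer input = new StringBuffer(256);
            int c;
            while (((c = read()) != -1) && (c != '\n')) {
                input.append((char) c);
            }
            if ((c == -1) && (input.length() == 0)) {
                return null;
            }
            // drop the '\r' of a "\r\n" terminator, as the fast path below does
            int len = input.length();
            if (c == '\n' && len > 0 && input.charAt(len - 1) == '\r') {
                input.setLength(len - 1);
            }
            return input.toString();
        }
        // drop a '\r' that precedes the '\n'; hibyte 0 maps each byte to one char
        // (this String constructor is deprecated in later JDKs)
        if (lineend > buf_pos && buffer[lineend - 1] == '\r')
            str = new String(buffer, 0, buf_pos, lineend - buf_pos - 1);
        else str = new String(buffer, 0, buf_pos, lineend - buf_pos);
        buf_pos = lineend + 1;
        return str;
    }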
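The fast path of getNextLine, scanning the buffered bytes for '\n' and dropping a preceding '\r', can be isolated into a pure function for experimentation. The sketch below is an illustration under assumptions: LineScan and nextLine are hypothetical names, and bytes are decoded as ISO-8859-1 so that each byte maps to one char, which requires a JDK much newer than the 1.1 release discussed here.

```java
import java.nio.charset.StandardCharsets;

class LineScan {
    // Returns the line starting at pos within buf[0..end), without its
    // terminator, or null if no '\n' is found in that range.
    static String nextLine(byte[] buf, int pos, int end) {
        for (int i = pos; i < end; i++) {
            if (buf[i] == '\n') {
                int len = i - pos;
                if (len > 0 && buf[i - 1] == '\r') {
                    len--; // treat "\r\n" as a single terminator
                }
                return new String(buf, pos, len, StandardCharsets.ISO_8859_1);
            }
        }
        return null;
    }
}
```

A null result corresponds to the case where the real method has to fall back to reading byte by byte across a buffer refill.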

With the new Braf class, we have measured at least a 25-fold performance improvement over RandomAccessFile when a large file is parsed line by line. The approach described here also applies wherever intensive file I/O is involved.

Synchronization turn-off: An extra tip

Another factor that slows down Java programs, besides the I/O problem discussed above, is the synchronized statement. Generally, the overhead of a synchronized method is about 6 times that of a conventional method. If you are writing an application without multithreading, or a part of an application in which you know for sure that only one thread is involved, you don't need anything to be synchronized. Currently, there is no mechanism in Java to turn off synchronization. A simple trick is to get the source code of a class, remove the synchronized statements, and generate a new class. For example, in BufferedInputStream both read methods are synchronized, and all other I/O methods depend on them. You can copy the source code from BufferedInputStream.java provided with JavaSoft's JDK 1.1, rename the class to NewBIS, for example, remove the synchronized statements from NewBIS.java, and recompile NewBIS.
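To make the idea concrete without copying JDK source, here is a minimal sketch of a buffered input stream whose read methods are not synchronized. UnsyncBufferedInput is a hypothetical class written for this illustration, not the NewBIS class described above. Only the two read methods are buffered (skip, available, and mark are left untouched), and it is safe only when a single thread uses it.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class UnsyncBufferedInput extends FilterInputStream {
    private final byte[] buf;
    private int pos = 0;   // index of the next byte to return
    private int end = 0;   // number of valid bytes in buf

    UnsyncBufferedInput(InputStream in, int size) {
        super(in);
        buf = new byte[size];
    }

    // No 'synchronized' keyword: callers must guarantee single-threaded use.
    @Override
    public int read() throws IOException {
        if (pos >= end) {
            int n = in.read(buf, 0, buf.length);
            if (n <= 0) {
                return -1;
            }
            end = n;
            pos = 0;
        }
        return buf[pos++] & 0xff;
    }

    // Fills b from the buffered single-byte read; returns -1 only when
    // nothing could be read because the stream is at its end.
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int i = 0;
        for (; i < len; i++) {
            int c = read();
            if (c == -1) {
                break;
            }
            b[off + i] = (byte) c;
        }
        return (i == 0 && len > 0) ? -1 : i;
    }
}
```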
