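I found this article very good, so I am reposting it. Author: Paul Lin

First, here is the introduction from the Apache Commons IO website, to give a basic picture of this well-known open-source library: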
Commons IO is a library of utilities to assist with developing IO functionality. There are four main areas included:
● Utility classes - with static methods to perform common tasks
● Filters - various implementations of file filters
● Comparators - various implementations of java.util.Comparator for files
● Streams - useful stream, reader and writer implementations







org.apache.commons.io This package defines utility classes for working with streams, readers, writers and files.
org.apache.commons.io.comparator This package provides various Comparator implementations for Files.
org.apache.commons.io.filefilter This package defines an interface (IOFileFilter) that combines both FileFilter and FilenameFilter.
org.apache.commons.io.input This package provides implementations of input classes, such as InputStream and Reader.
org.apache.commons.io.output This

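The input package extends the Sun JDK IO package with a number of simple IO classes, mainly implementations of the byte and character input stream interfaces.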


AutoCloseInputStream: Proxy stream that closes and discards the underlying stream as soon as the end of input has been reached or when the stream is explicitly closed. Not even a reference to the underlying stream is kept after it has been closed, so any allocated in-memory buffers can be freed even if the client application still keeps a reference to the proxy stream.
This class is typically used to release any resources related to an open stream as soon as possible even if the client application (by not explicitly closing the stream when no longer needed) or the underlying stream (by not releasing resources once the last byte has been read) do not do that.

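    new BufferedInputStream(new FileInputStream(FILE))


In this expression no reference to the inner FileInputStream(FILE) is kept, so it cannot be closed directly; if the caller never closes the outer stream, the file handle stays open, which can cause problems. With AutoCloseInputStream, the underlying input stream is closed automatically as soon as all the data has been read, so resources are released promptly.

    new BufferedInputStream(new AutoCloseInputStream(new FileInputStream(FILE)));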



    package org.apache.commons.io.input;

    import java.io.IOException;
    import java.io.InputStream;

    public class AutoCloseInputStream extends ProxyInputStream {

        public AutoCloseInputStream(InputStream in) {
            super(in);
        }

        public void close() throws IOException {
            in.close();
            in = new ClosedInputStream();
        }

        public int read() throws IOException {
            int n = in.read();
            if (n == -1) {
                close();
            }
            return n;
        }

        public int read(byte[] b) throws IOException {
            int n = in.read(b);
            if (n == -1) {
                close();
            }
            return n;
        }

        public int read(byte[] b, int off, int len) throws IOException {
            int n = in.read(b, off, len);
            if (n == -1) {
                close();
            }
            return n;
        }

        protected void finalize() throws Throwable {
            close();
            super.finalize();
        }
    }

    // ClosedInputStream is a separate class, shown here for reference.
    public class ClosedInputStream extends InputStream {

        /**
         * A singleton.
         */
        public static final ClosedInputStream CLOSED_INPUT_STREAM = new ClosedInputStream();

        /**
         * Returns -1 to indicate that the stream is closed.
         *
         * @return always -1
         */
        public int read() {
            return -1;
        }
    }


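As the source shows, this class ensures that the underlying stream is closed correctly in two ways:
① Each time a read method is called, if the underlying stream returns -1, the underlying input stream is closed immediately and replaced with a ClosedInputStream.
② When the object is garbage-collected, finalize() makes sure the underlying input stream is closed.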
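The close-on-end-of-input behaviour can be tried out without the library. The following is a minimal sketch using only JDK classes; `ClosingOnEof` and `TrackingStream` are hypothetical names invented for this illustration, not Commons IO classes, and the sketch leaves out the finalize() path and the swap to a ClosedInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class AutoCloseSketch {

    /** Test double: a ByteArrayInputStream that records whether close() was called. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical stand-in for AutoCloseInputStream: closes the source at end of input. */
    static class ClosingOnEof extends FilterInputStream {
        ClosingOnEof(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int n = in.read();
            if (n == -1) {
                close(); // FilterInputStream.close() closes the wrapped stream
            }
            return n;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int n = in.read(b, off, len);
            if (n == -1) {
                close();
            }
            return n;
        }
    }

    /** Reads the data to the end without calling close() explicitly, then reports the source's state. */
    public static boolean closedAfterDrain(byte[] data) {
        TrackingStream source = new TrackingStream(data);
        InputStream in = new ClosingOnEof(source);
        try {
            while (in.read() != -1) {
                // discard the bytes; only the end-of-input behaviour matters here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.closed;
    }

    public static void main(String[] args) {
        System.out.println(closedAfterDrain(new byte[] {1, 2, 3})); // prints true
    }
}
```

The caller never closes anything, yet the source is closed once the first read returns -1, which is the first of the two mechanisms listed above.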
TeeInputStream: InputStream proxy that transparently writes a copy of all bytes read from the proxied stream to a given OutputStream. The proxied input stream is closed when the close() method is called on this proxy. It is configurable whether the associated output stream will also be closed.

    public int read(byte[] bts, int st, int end) throws IOException {
        int n = super.read(bts, st, end);
        if (n != -1) {
            branch.write(bts, st, n);
        }
        return n;
    }
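To see the effect of the read method above in isolation, here is a small JDK-only sketch of the same idea. `CopyingStream` is a hypothetical name used for this illustration; it is not the library class, and it does not deal with closing the branch stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TeeSketch {

    /** Hypothetical stand-in for TeeInputStream: copies every byte read into 'branch'. */
    static class CopyingStream extends FilterInputStream {
        private final OutputStream branch;

        CopyingStream(InputStream in, OutputStream branch) {
            super(in);
            this.branch = branch;
        }

        @Override
        public int read() throws IOException {
            int ch = in.read();
            if (ch != -1) {
                branch.write(ch);
            }
            return ch;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int n = in.read(b, off, len);
            if (n != -1) {
                branch.write(b, off, n); // copy only the n bytes actually read
            }
            return n;
        }
    }

    /** Drains the text through the copying stream and returns what the branch captured. */
    public static String captured(String text) {
        ByteArrayOutputStream branch = new ByteArrayOutputStream();
        byte[] source = text.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new CopyingStream(new ByteArrayInputStream(source), branch)) {
            byte[] buf = new byte[4];
            while (in.read(buf) != -1) {
                // the caller consumes the data as usual; the copy happens as a side effect
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(branch.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(captured("hello world")); // prints hello world
    }
}
```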



[Posted: 2010-03-08 22:02]

CharSequenceReader: Reader implementation that can read from String, StringBuffer, StringBuilder or CharBuffer.
NullReader: A functional, lightweight Reader that emulates a reader of a specified size.
This implementation provides a lightweight object for testing with a Reader where the contents don't matter.
One use case would be for testing the handling of a large Reader, as it can emulate that scenario without the overhead of actually processing large numbers of characters, significantly speeding up test execution times.

    /**
     * Read the specified number characters into an array.
     *
     * @param chars The character array to read into.
     * @param offset The offset to start reading characters into.
     * @param length The number of characters to read.
     * @return The number of characters read or <code>-1</code>
     * if the end of file has been reached and
     * <code>throwEofException</code> is set to <code>false</code>.
     * @throws EOFException if the end of file is reached and
     * <code>throwEofException</code> is set to <code>true</code>.
     * @throws IOException if trying to read past the end of file.
     */
    public int read(char[] chars, int offset, int length) throws IOException {
        if (eof) {
            throw new IOException("Read after end of file");
        }
        if (position == size) {
            return doEndOfFile();
        }
        position += length;
        int returnLength = length;
        if (position > size) {
            returnLength = length - (int)(position - size);
            position = size;
        }
        processChars(chars, offset, returnLength);
        return returnLength;
    }

    /**
     * Return a character value for the <code>read()</code> method.
     * <p>
     * This implementation returns zero.
     *
     * @return This implementation always returns zero.
     */
    protected int processChar() {
        // do nothing - overridable by subclass
        return 0;
    }

    /**
     * Process the characters for the <code>read(char[], offset, length)</code>
     * method.
     * <p>
     * This implementation leaves the character array unchanged.
     *
     * @param chars The character array
     * @param offset The offset to start at.
     * @param length The number of characters.
     */
    protected void processChars(char[] chars, int offset, int length) {
        // do nothing - overridable by subclass
    }
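The position bookkeeping in read(char[], int, int) above can be illustrated with a stripped-down reader built on the JDK alone. `SizedReader` is a hypothetical name for this sketch. Unlike the source above, it simply returns -1 at the end rather than calling doEndOfFile() or throwing on a read after end of file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class SizedReaderSketch {

    /** Hypothetical stand-in for NullReader: reports 'size' characters without storing any. */
    static class SizedReader extends Reader {
        private final long size;
        private long position = 0;

        SizedReader(long size) {
            this.size = size;
        }

        @Override
        public int read(char[] chars, int offset, int length) {
            if (position == size) {
                return -1; // emulated end of input
            }
            // Never report more characters than remain before 'size'.
            int returnLength = (int) Math.min(length, size - position);
            position += returnLength;
            return returnLength; // the array itself is left unchanged
        }

        @Override
        public void close() {
            // nothing to release
        }
    }

    /** Counts how many characters a consumer sees when reading a SizedReader to the end. */
    public static long countChars(long size) {
        char[] buf = new char[8192];
        long total = 0;
        try (Reader reader = new SizedReader(size)) {
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countChars(1_000_000L)); // prints 1000000
    }
}
```

A test can use such a reader to exercise code paths for very large inputs without allocating or generating the characters.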





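Like the input package, the output package also implements or extends some of the classes and interfaces of the JDK IO package. Three classes deserve special attention:
① ByteArrayOutputStream
② FileWriterWithEncoding
③ LockableFileWriter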
This class implements an output stream in which the data is written into a byte array. The buffer automatically grows as data is written to it.
The data can be retrieved using toByteArray() and toString().
Closing a ByteArrayOutputStream has no effect. The methods in this class can be called after the stream has been closed without generating an IOException.
This is an alternative implementation of the java.io.ByteArrayOutputStream class. The original implementation only allocates 32 bytes at the beginning. As this class is designed for heavy duty it starts at 1024 bytes. In contrast to the original it doesn't reallocate the whole memory block but allocates additional buffers. This way no buffers need to be garbage collected and the contents don't have to be copied to the new buffer. This class is designed to behave exactly like the original. The only exception is the deprecated toString(int) method that has been ignored.
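From the documentation above, we can see that the Apache Commons IO ByteArrayOutputStream is more efficient than the ByteArrayOutputStream that ships with the Sun JDK, for the following reasons:
① The initial buffer is much larger than that of the JDK's ByteArrayOutputStream (1024 bytes versus 32).
② The buffer can keep growing, limited only by available memory. When the current buffer is full, an additional buffer is allocated instead of reallocating the whole block and copying the old contents.
③ The number of write calls is reduced: several first-level buffers are written out in one pass, which cuts call overhead.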

    // org.apache.commons.io.output.ByteArrayOutputStream

    /** The list of buffers, which grows and never reduces. */
    private List buffers = new ArrayList();
    /** The current buffer. */
    private byte[] currentBuffer;

    /**
     * Creates a new byte array output stream. The buffer capacity is
     * initially 1024 bytes, though its size increases if necessary.
     */
    public ByteArrayOutputStream() {
        this(1024);
    }

    // java.io.ByteArrayOutputStream (JDK), for comparison

    /**
     * The buffer where data is stored.
     */
    protected byte buf[];

    /**
     * Creates a new byte array output stream. The buffer capacity is
     * initially 32 bytes, though its size increases if necessary.
     */
    public ByteArrayOutputStream() {
        this(32);
    }


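It turns out that Apache Commons IO uses two levels of buffering. The first level is a byte[] (currentBuffer), whose contents change with every write. The second level is an ArrayList (buffers) that can grow without limit and holds every byte[] that has been filled. This is naturally much more efficient. So how does the class grow its buffer dynamically without having to discard the existing buffers each time?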

    /**
     * Makes a new buffer available either by allocating
     * a new one or re-cycling an existing one.
     *
     * @param newcount  the size of the buffer if one is created
     */
    private void needNewBuffer(int newcount) {
        if (currentBufferIndex < buffers.size() - 1) {
            //Recycling old buffer
            filledBufferSum += currentBuffer.length;
            currentBufferIndex++;
            currentBuffer = getBuffer(currentBufferIndex);
        } else {
            //Creating new buffer
            int newBufferSize;
            if (currentBuffer == null) {
                newBufferSize = newcount;
                filledBufferSum = 0;
            } else {
                newBufferSize = Math.max(
                    currentBuffer.length << 1,
                    newcount - filledBufferSum);
                filledBufferSum += currentBuffer.length;
            }
            currentBufferIndex++;
            currentBuffer = new byte[newBufferSize];
            buffers.add(currentBuffer);
        }
    }


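In the initial state currentBuffer == null, so the first first-level byte[] buffer gets the default size of 1024 or the size specified by the user. filledBufferSum and currentBufferIndex are then initialized, and the newly created first-level buffer is added to the second-level list, buffers. When a later call finds no buffer to recycle, it allocates a new one at least twice the length of the current buffer (currentBuffer.length << 1) and appends it to the list, so the existing buffers are never copied or discarded.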
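The two-level scheme can be sketched in a few lines of plain Java. This is a simplified illustration, not the library code: `ChunkedBufferSketch` is a hypothetical class, it allocates its first chunk eagerly in the constructor, it always doubles the chunk size, and it never recycles old chunks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkedBufferSketch {
    private final List<byte[]> buffers = new ArrayList<>(); // second level: every chunk allocated so far
    private byte[] current;   // first level: the chunk currently being filled
    private int posInCurrent; // bytes used in 'current'
    private int count;        // total bytes written

    public ChunkedBufferSketch(int initialSize) {
        if (initialSize <= 0) {
            throw new IllegalArgumentException("initialSize must be positive");
        }
        current = new byte[initialSize];
        buffers.add(current);
    }

    public void write(byte[] data) {
        int off = 0;
        while (off < data.length) {
            if (posInCurrent == current.length) {
                // Current chunk is full: add a chunk twice as large; old chunks stay untouched.
                current = new byte[current.length << 1];
                buffers.add(current);
                posInCurrent = 0;
            }
            int n = Math.min(data.length - off, current.length - posInCurrent);
            System.arraycopy(data, off, current, posInCurrent, n);
            posInCurrent += n;
            off += n;
            count += n;
        }
    }

    /** Concatenates the chunks; only the last one may be partially filled. */
    public byte[] toByteArray() {
        byte[] out = new byte[count];
        int pos = 0;
        for (byte[] buf : buffers) {
            int n = Math.min(buf.length, count - pos);
            System.arraycopy(buf, 0, out, pos, n);
            pos += n;
            if (pos == count) {
                break;
            }
        }
        return out;
    }

    /** Lengths of the allocated chunks, in allocation order. */
    public List<Integer> chunkSizes() {
        List<Integer> sizes = new ArrayList<>();
        for (byte[] buf : buffers) {
            sizes.add(buf.length);
        }
        return sizes;
    }

    public static void main(String[] args) {
        ChunkedBufferSketch out = new ChunkedBufferSketch(4);
        byte[] data = new byte[13];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        out.write(data);
        System.out.println(out.chunkSizes());                       // prints [4, 8, 16]
        System.out.println(Arrays.equals(data, out.toByteArray())); // prints true
    }
}
```

Writing 13 bytes into a buffer that starts at 4 bytes fills the 4-byte and 8-byte chunks and puts the last byte in a new 16-byte chunk. No data is copied between chunks until toByteArray() is called.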


Compared with the JDK class, this class has an additional write(InputStream in) method. Here is its source code:

    public synchronized int write(InputStream in) throws IOException {
        int readCount = 0;
        int inBufferPos = count - filledBufferSum;
        int n = in.read(currentBuffer, inBufferPos, currentBuffer.length - inBufferPos);
        while (n != -1) {
            readCount += n;
            inBufferPos += n;
            count += n;
            if (inBufferPos == currentBuffer.length) {
                needNewBuffer(currentBuffer.length);
                inBufferPos = 0;
            }
            n = in.read(currentBuffer, inBufferPos, currentBuffer.length - inBufferPos);
        }
        return readCount;
    }



    public synchronized void writeTo(OutputStream out) throws IOException {
        int remaining = count;
        for (int i = 0; i < buffers.size(); i++) {
            byte[] buf = getBuffer(i);
            int c = Math.min(buf.length, remaining);
            out.write(buf, 0, c);
            remaining -= c;
            if (remaining == 0) {
                break;
            }
        }
    }



    // Excerpt from FileWriterWithEncoding: the OutputStreamWriter constructor
    // is chosen according to the runtime type of 'encoding'.
    OutputStream stream = null;
    Writer writer = null;
    try {
        stream = new FileOutputStream(file, append);
        if (encoding instanceof Charset) {
            writer = new OutputStreamWriter(stream, (Charset)encoding);
        } else if (encoding instanceof CharsetEncoder) {
            writer = new OutputStreamWriter(stream, (CharsetEncoder)encoding);
        } else {
            writer = new OutputStreamWriter(stream, (String)encoding);
        }
    // (the rest of the method is not shown in the original post)
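The excerpt above wraps the file stream in an OutputStreamWriter with an explicit encoding. Why the encoding matters can be shown with JDK classes alone. In this sketch, `encodedLength` is a hypothetical helper name, and a ByteArrayOutputStream stands in for the file so the example needs no disk access.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingWriterSketch {

    /** Writes the text through an OutputStreamWriter and returns the number of bytes produced. */
    public static int encodedLength(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character e with an acute accent
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // prints 2
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // prints 1
    }
}
```

The same character occupies a different number of bytes depending on the charset, which is why a writer that lets the caller state the encoding is safer than one that relies on the platform default.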



    /**
     * Constructs a LockableFileWriter with a file encoding.
     *
     * @param file  the file to write to, not null
     * @param encoding  the encoding to use, null means platform default
     * @param append  true if content should be appended, false to overwrite
     * @param lockDir  the directory in which the lock file should be held
     * @throws NullPointerException if the file is null
     * @throws IOException in case of an I/O error
     */
    public LockableFileWriter(File file, String encoding, boolean append,
            String lockDir) throws IOException {
        super();
        // init file to create/append
        file = file.getAbsoluteFile();
        if (file.getParentFile() != null) {
            FileUtils.forceMkdir(file.getParentFile());
        }
        if (file.isDirectory()) {
            throw new IOException("File specified is a directory");
        }
        // init lock file
        if (lockDir == null) {
            lockDir = System.getProperty("java.io.tmpdir");
        }
        File lockDirFile = new File(lockDir);
        FileUtils.forceMkdir(lockDirFile);
        testLockDir(lockDirFile);
        lockFile = new File(lockDirFile, file.getName() + LCK);
        // check if locked
        createLock();
        // init wrapped writer
        out = initWriter(file, encoding, append);
    }



    private void createLock() throws IOException {
        synchronized (LockableFileWriter.class) {
            if (!lockFile.createNewFile()) {
                throw new IOException("Can't write file, lock " +
                        lockFile.getAbsolutePath() + " exists");
            }
            lockFile.deleteOnExit();
        }
    }


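The next question is how the lock is implemented. Going back to the constructor above: constructing a LockableFileWriter calls createLock(), and if that method finds that the lock file already exists (that is, another writer currently holds the lock), it throws an IOException. Construction then fails, so no subsequent write can take place.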

    public void close() throws IOException {
        try {
            out.close();
        } finally {
            lockFile.delete();
        }
    }


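Each process must call close() once it has finished writing; this deletes the lock file and releases the lock. Compared with the object lock (synchronized(object)) that the JDK's Writer uses, which only coordinates threads inside a single JVM, the lock-file approach is simple and also works across processes. This class was originally designed as a replacement for the plain FileWriter.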
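The lock-file idea rests on File.createNewFile(), which creates the file only if it does not already exist and reports whether it did so. Below is a minimal JDK-only sketch of acquiring and releasing such a lock. `tryLock`, `unlock`, `demo` and the `data.txt.lck` file name are hypothetical names chosen for this illustration, not part of LockableFileWriter.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class LockFileSketch {

    /** Tries to take the lock by creating the lock file; false means someone else holds it. */
    public static boolean tryLock(File lockFile) {
        try {
            return lockFile.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock by deleting the lock file. */
    public static boolean unlock(File lockFile) {
        return lockFile.delete();
    }

    /** Returns the results of: first acquire, second acquire while held, acquire after release. */
    public static boolean[] demo() {
        try {
            File dir = Files.createTempDirectory("lock-sketch").toFile();
            File lock = new File(dir, "data.txt.lck");
            boolean first = tryLock(lock);  // lock is free
            boolean second = tryLock(lock); // lock file already exists
            unlock(lock);
            boolean third = tryLock(lock);  // free again after release
            unlock(lock);
            dir.delete();
            return new boolean[] {first, second, third};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints true false true
    }
}
```

If a process exits without releasing, the lock file stays behind and blocks later writers, which is presumably why createLock() above also calls deleteOnExit() on the lock file.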
