A Close Look at the Lucene Source (1): The Index File Lock Mechanism

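As everyone knows, in a multi-threaded or multi-process environment, access to a shared resource needs special care, especially writes: without locking, the consequences can be serious. Lucene's index is no exception. Lucene splits index access between IndexReader and IndexWriter; as the names suggest, one reads and one writes. Lucene lets you open many IndexReader objects on the same index, but only one IndexWriter. How is that enforced? Clearly by locking: a lock guarantees that only one IndexWriter can be open on an index at a time. Let's look at Lucene's index file lock mechanism in detail: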

What happens if we open several different IndexWriters on the same index?

IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
IndexWriter indexWriter = new IndexWriter(dir, indexWriterConfig);
IndexWriterConfig indexWriterConfig2 = new IndexWriterConfig(analyzer);
IndexWriter indexWriter2 = new IndexWriter(dir, indexWriterConfig2);

After running it, the console prints:

Exception in thread "main" org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\new\Desktop\Lucene\write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:89)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
    at test.Index.index(Index.java:51)
    at test.Index.main(Index.java:78)

Clearly, you cannot open more than one IndexWriter on the same index.

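A simplified class diagram of these classes shows that Lucene uses the factory method pattern, which makes it easy to plug in other implementations. Here I only use SimpleFSLock as the example to explain Lucene's lock mechanism (if you are interested in the others, read the Lucene source).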

Lock is the base class for locks, an abstract class. Its source is as follows:

public abstract class Lock implements Closeable {

  /** How long {@link #obtain(long)} waits, in milliseconds,
   *  in between attempts to acquire the lock. */
  public static long LOCK_POLL_INTERVAL = 1000;

  /** Pass this value to {@link #obtain(long)} to try
   *  forever to obtain the lock. */
  public static final long LOCK_OBTAIN_WAIT_FOREVER = -1;

  /** Attempts to obtain exclusive access and immediately return
   *  upon success or failure.  Use {@link #close} to
   *  release the lock.
   * @return true iff exclusive access is obtained
   */
  public abstract boolean obtain() throws IOException;

  /**
   * If a lock obtain called, this failureReason may be set
   * with the "root cause" Exception as to why the lock was
   * not obtained.
   */
  protected Throwable failureReason;

  /** Attempts to obtain an exclusive lock within amount of
   *  time given. Polls once per {@link #LOCK_POLL_INTERVAL}
   *  (currently 1000) milliseconds until lockWaitTimeout is
   *  passed.
   * @param lockWaitTimeout length of time to wait in
   *        milliseconds or {@link
   *        #LOCK_OBTAIN_WAIT_FOREVER} to retry forever
   * @return true if lock was obtained
   * @throws LockObtainFailedException if lock wait times out
   * @throws IllegalArgumentException if lockWaitTimeout is
   *         out of bounds
   * @throws IOException if obtain() throws IOException
   */
  public final boolean obtain(long lockWaitTimeout) throws IOException {
    failureReason = null;
    boolean locked = obtain();
    if (lockWaitTimeout < 0 && lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER)
      throw new IllegalArgumentException("lockWaitTimeout should be LOCK_OBTAIN_WAIT_FOREVER or a non-negative number (got " + lockWaitTimeout + ")");

    long maxSleepCount = lockWaitTimeout / LOCK_POLL_INTERVAL;
    long sleepCount = 0;
    while (!locked) {
      if (lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
        String reason = "Lock obtain timed out: " + this.toString();
        if (failureReason != null) {
          reason += ": " + failureReason;
        }
        throw new LockObtainFailedException(reason, failureReason);
      }
      try {
        Thread.sleep(LOCK_POLL_INTERVAL);
      } catch (InterruptedException ie) {
        throw new ThreadInterruptedException(ie);
      }
      locked = obtain();
    }
    return locked;
  }

  /** Releases exclusive access. */
  public abstract void close() throws IOException;

  /** Returns true if the resource is currently locked.  Note that one must
   * still call {@link #obtain()} before using the resource. */
  public abstract boolean isLocked() throws IOException;

  /** Utility class for executing code with exclusive access. */
  public abstract static class With {
    private Lock lock;
    private long lockWaitTimeout;

    /** Constructs an executor that will grab the named lock. */
    public With(Lock lock, long lockWaitTimeout) {
      this.lock = lock;
      this.lockWaitTimeout = lockWaitTimeout;
    }

    /** Code to execute with exclusive access. */
    protected abstract Object doBody() throws IOException;

    /** Calls {@link #doBody} while <i>lock</i> is obtained.  Blocks if lock
     * cannot be obtained immediately.  Retries to obtain lock once per second
     * until it is obtained, or until it has tried ten times. Lock is released when
     * {@link #doBody} exits.
     * @throws LockObtainFailedException if lock could not
     * be obtained
     * @throws IOException if {@link Lock#obtain} throws IOException
     */
    public Object run() throws IOException {
      boolean locked = false;
      try {
        locked = lock.obtain(lockWaitTimeout);
        return doBody();
      } finally {
        if (locked) {
          lock.close();
        }
      }
    }
  }
}

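The most important method here is obtain(). The no-argument obtain() makes a single attempt to acquire the lock and returns immediately. obtain(long lockWaitTimeout) builds on it: if the first attempt fails, it sleeps for LOCK_POLL_INTERVAL (1000 ms by default) and tries again, repeating until the lock is acquired or lockWaitTimeout has been used up, at which point it throws LockObtainFailedException. That is the "Lock obtain timed out" exception we saw earlier. You can also pass -1 (LOCK_OBTAIN_WAIT_FOREVER) as lockWaitTimeout, in which case it keeps retrying forever.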
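The retry loop in obtain(long) can be tried out on its own, without Lucene. The following is a minimal sketch, not Lucene's code: the class PollingLockSketch, the method obtainWithTimeout, and the BooleanSupplier standing in for the single-attempt obtain() are names made up for illustration, and a timeout is reported with a plain IllegalStateException instead of LockObtainFailedException.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the polling logic of Lock.obtain(long); not part of Lucene.
class PollingLockSketch {
    static final long WAIT_FOREVER = -1;

    /**
     * Calls tryObtain until it returns true, sleeping pollMs between attempts.
     * Returns the number of attempts made; throws IllegalStateException on timeout.
     */
    static int obtainWithTimeout(BooleanSupplier tryObtain, long timeoutMs, long pollMs) {
        if (timeoutMs < 0 && timeoutMs != WAIT_FOREVER) {
            throw new IllegalArgumentException("timeoutMs must be WAIT_FOREVER or non-negative");
        }
        if (pollMs <= 0) {
            throw new IllegalArgumentException("pollMs must be positive");
        }
        int attempts = 1;
        boolean locked = tryObtain.getAsBoolean();  // first attempt, no waiting
        long maxSleepCount = timeoutMs / pollMs;     // how many sleeps fit into the timeout
        long sleepCount = 0;
        while (!locked) {
            if (timeoutMs != WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
                throw new IllegalStateException(
                        "Lock obtain timed out after " + attempts + " attempts");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();  // keep the interrupt visible to callers
                throw new RuntimeException(ie);
            }
            locked = tryObtain.getAsBoolean();       // poll again after sleeping
            attempts++;
        }
        return attempts;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Pretend the lock becomes free on the third attempt.
        int attempts = obtainWithTimeout(() -> ++calls[0] >= 3, 100, 10);
        System.out.println("acquired after " + attempts + " attempts");
    }
}
```

With a timeout of 3000 ms and a poll interval of 1000 ms, this loop makes one immediate attempt plus three more after sleeping, so four attempts in total before it gives up.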

The abstract base class LockFactory defines just one abstract method, makeLock, which returns a Lock instance.

public abstract class LockFactory {

  /**
   * Return a new Lock instance identified by lockName.
   * @param lockName name of the lock to be created.
   */
  public abstract Lock makeLock(Directory dir, String lockName);
}

The abstract class FSLockFactory extends LockFactory:

public abstract class FSLockFactory extends LockFactory {

  /** Returns the default locking implementation for this platform.
   * This method currently returns always {@link NativeFSLockFactory}.
   */
  public static final FSLockFactory getDefault() {
    return NativeFSLockFactory.INSTANCE;
  }

  @Override
  public final Lock makeLock(Directory dir, String lockName) {
    if (!(dir instanceof FSDirectory)) {
      throw new UnsupportedOperationException(getClass().getSimpleName() + " can only be used with FSDirectory subclasses, got: " + dir);
    }
    return makeFSLock((FSDirectory) dir, lockName);
  }

  /** Implement this method to create a lock for a FSDirectory instance. */
  protected abstract Lock makeFSLock(FSDirectory dir, String lockName);
}

As you can see:

public static final FSLockFactory getDefault() {
  return NativeFSLockFactory.INSTANCE;
}

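This method returns NativeFSLockFactory by default. Like SimpleFSLockFactory, it is a concrete implementation. NativeFSLockFactory uses the FileChannel.tryLock method from NIO. I won't go into it here; interested readers can read the JDK NIO source (it seems Oracle no longer ships the source of the FileChannel implementation class, so you may have to look for it in the JVM source).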

Now for the main event of this post, SimpleFSLockFactory:

public final class SimpleFSLockFactory extends FSLockFactory {

  /**
   * Singleton instance
   */
  public static final SimpleFSLockFactory INSTANCE = new SimpleFSLockFactory();

  private SimpleFSLockFactory() {}

  @Override
  protected Lock makeFSLock(FSDirectory dir, String lockName) {
    return new SimpleFSLock(dir.getDirectory(), lockName);
  }

  static class SimpleFSLock extends Lock {

    Path lockFile;
    Path lockDir;

    public SimpleFSLock(Path lockDir, String lockFileName) {
      this.lockDir = lockDir;
      lockFile = lockDir.resolve(lockFileName);
    }

    @Override
    public boolean obtain() throws IOException {
      try {
        Files.createDirectories(lockDir);
        Files.createFile(lockFile);
        return true;
      } catch (IOException ioe) {
        // On Windows, on concurrent createNewFile, the 2nd process gets "access denied".
        // In that case, the lock was not aquired successfully, so return false.
        // We record the failure reason here; the obtain with timeout (usually the
        // one calling us) will use this as "root cause" if it fails to get the lock.
        failureReason = ioe;
        return false;
      }
    }

    @Override
    public void close() throws LockReleaseFailedException {
      // TODO: wierd that clearLock() throws the raw IOException...
      try {
        Files.deleteIfExists(lockFile);
      } catch (Throwable cause) {
        throw new LockReleaseFailedException("failed to delete " + lockFile, cause);
      }
    }

    @Override
    public boolean isLocked() {
      return Files.exists(lockFile);
    }

    @Override
    public String toString() {
      return "SimpleFSLock@" + lockFile;
    }
  }
}

SimpleFSLockFactory defines an inner class, SimpleFSLock, which extends Lock. Again we mainly look at the obtain method, this time SimpleFSLock's, which is where the file lock is actually implemented.

Files.createDirectories(lockDir);
Files.createFile(lockFile);

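Look at these two lines. createDirectories creates the directory that holds the write.lock file, along with any missing parent directories (the file name could be something else; Lucene uses write.lock by default). createFile then creates the write.lock file itself. Here is the clever part: createFile checks for the file and creates it as a single atomic operation, so if write.lock already exists it throws an exception. An exception means SimpleFSLock failed to acquire the file lock, which in turn means another process is writing to the index.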

The Files.deleteIfExists(lockFile); call in close() means that every time the IndexWriter is closed, the write.lock file is deleted.

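To sum up, SimpleFSLockFactory's file locking can be understood simply as follows: it creates a write.lock file in the index directory. If the directory already contains a write.lock file, some other process is writing to this index directory, so acquiring the lock fails and nothing can be added to the index. When the writer has finished adding data and is closed, the write.lock file is deleted and the lock is released. In other words, if write.lock exists, a process is already writing to the index; if it does not exist, the file is created, the lock is held, and no other process can write.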
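The whole mechanism fits in a few lines of plain java.nio.file code. Below is a minimal sketch under stated assumptions, not Lucene's class: CreateFileLockSketch, release() and demo() are hypothetical names, the demo works in a throwaway temp directory, and failures are simply reported as false rather than recorded in a failureReason.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of "lock file exists = locked"; not Lucene's SimpleFSLock.
class CreateFileLockSketch {
    private final Path lockDir;
    private final Path lockFile;

    CreateFileLockSketch(Path lockDir, String lockFileName) {
        this.lockDir = lockDir;
        this.lockFile = lockDir.resolve(lockFileName);
    }

    /** One attempt: true if we created the lock file, false if creation failed. */
    boolean obtain() {
        try {
            Files.createDirectories(lockDir);
            Files.createFile(lockFile);  // atomic check-and-create; throws if the file exists
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Releases the lock by deleting the lock file. */
    void release() {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs obtain / obtain / release / obtain in a fresh temp directory. */
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("lock-sketch");
            CreateFileLockSketch first = new CreateFileLockSketch(dir, "write.lock");
            CreateFileLockSketch second = new CreateFileLockSketch(dir, "write.lock");
            boolean a = first.obtain();   // file did not exist yet, so we get the lock
            boolean b = second.obtain();  // file exists, so this attempt fails
            first.release();
            boolean c = second.obtain();  // file is gone again, so this succeeds
            second.release();
            Files.deleteIfExists(dir);    // clean up the temp directory
            return new boolean[] {a, b, c};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);  // prints "true false true"
    }
}
```

One consequence is visible in the sketch: if a process dies before release() runs, the file stays behind, and every later obtain() keeps failing until someone deletes it.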

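This is a very neat way to implement a write lock. Some readers may wonder why, in their own demo, the write.lock file is still there after they build the index and call close. That is because Lucene's default implementation is now NativeFSLockFactory, which, as mentioned above, implements the lock by calling native methods through NIO. With that implementation the lock is the OS-level lock held on the file rather than the file's mere existence, so a leftover write.lock does not by itself mean the index is locked.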
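To see why a leftover file is harmless with an OS-level lock, here is a minimal sketch built directly on FileChannel.tryLock. It is not Lucene's NativeFSLock code: NativeLockSketch and demo() are hypothetical names, and the demo assumes the temp directory lives on a filesystem that supports file locks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of an OS-level file lock; not Lucene's NativeFSLock.
class NativeLockSketch {

    /** Returns {lock acquired, second tryLock rejected, file still exists after release}. */
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("native-lock-sketch");
            Path lockFile = dir.resolve("write.lock");
            boolean acquired;
            boolean secondRejected = false;
            try (FileChannel channel = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                FileLock lock = channel.tryLock();  // null if another process holds the lock
                acquired = lock != null;
                try (FileChannel other = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                    other.tryLock();                // this JVM already holds the lock
                } catch (OverlappingFileLockException e) {
                    secondRejected = true;
                }
                if (lock != null) {
                    lock.release();                 // releases the lock, keeps the file
                }
            }
            boolean fileStillThere = Files.exists(lockFile);
            Files.delete(lockFile);                 // clean up the demo files
            Files.delete(dir);
            return new boolean[] {acquired, secondRejected, fileStillThere};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("acquired=" + r[0] + " secondRejected=" + r[1]
                + " fileStillThere=" + r[2]);
    }
}
```

Releasing the FileLock (or closing its channel) gives up the lock but never deletes the file, which matches the write.lock that remains on disk after close.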
