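A Close Look at the Lucene Source (1): The Index File Lock Mechanism
As everyone knows, in a multi-threaded or multi-process environment, access to a shared resource needs special care, especially writes: without a lock, serious problems follow. Lucene's index is no exception. Lucene splits index access between IndexReader and IndexWriter; as the names suggest, one reads and one writes. Lucene lets you open many IndexReader objects on the same index, but only one IndexWriter. How is that enforced? With a lock, which guarantees that only one IndexWriter can be open on an index at a time. Let's look at Lucene's index file lock mechanism in detail:
What happens if we create several different IndexWriter instances on the same index?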
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
IndexWriter indexWriter = new IndexWriter(dir, indexWriterConfig);
IndexWriterConfig indexWriterConfig2 = new IndexWriterConfig(analyzer);
IndexWriter indexWriter2 = new IndexWriter(dir, indexWriterConfig2);
After running it, the console prints:
Exception in thread "main" org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@C:\Users\new\Desktop\Lucene\write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:89)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:755)
    at test.Index.index(Index.java:51)
    at test.Index.main(Index.java:78)
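Clearly, you cannot open more than one IndexWriter on the same index.
In the (fairly simplified) class diagram of the lock classes, you can see that Lucene uses the factory method pattern, which makes it easy to plug in other implementations. This article uses SimpleFSLock as the example for explaining Lucene's lock mechanism (if you are interested in the others, see the Lucene source).
Lock is the base class for locks, an abstract class. Its source is: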
public abstract class Lock implements Closeable {

  /** How long {@link #obtain(long)} waits, in milliseconds,
   *  in between attempts to acquire the lock. */
  public static long LOCK_POLL_INTERVAL = 1000;

  /** Pass this value to {@link #obtain(long)} to try
   *  forever to obtain the lock. */
  public static final long LOCK_OBTAIN_WAIT_FOREVER = -1;

  /** Attempts to obtain exclusive access and immediately return
   *  upon success or failure.  Use {@link #close} to
   *  release the lock.
   * @return true iff exclusive access is obtained
   */
  public abstract boolean obtain() throws IOException;

  /**
   * If a lock obtain called, this failureReason may be set
   * with the "root cause" Exception as to why the lock was
   * not obtained.
   */
  protected Throwable failureReason;

  /** Attempts to obtain an exclusive lock within amount of
   *  time given. Polls once per {@link #LOCK_POLL_INTERVAL}
   *  (currently 1000) milliseconds until lockWaitTimeout is
   *  passed.
   * @param lockWaitTimeout length of time to wait in
   *        milliseconds or {@link
   *        #LOCK_OBTAIN_WAIT_FOREVER} to retry forever
   * @return true if lock was obtained
   * @throws LockObtainFailedException if lock wait times out
   * @throws IllegalArgumentException if lockWaitTimeout is
   *         out of bounds
   * @throws IOException if obtain() throws IOException
   */
  public final boolean obtain(long lockWaitTimeout) throws IOException {
    failureReason = null;
    boolean locked = obtain();
    if (lockWaitTimeout < 0 && lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER)
      throw new IllegalArgumentException("lockWaitTimeout should be LOCK_OBTAIN_WAIT_FOREVER or a non-negative number (got " + lockWaitTimeout + ")");

    long maxSleepCount = lockWaitTimeout / LOCK_POLL_INTERVAL;
    long sleepCount = 0;
    while (!locked) {
      if (lockWaitTimeout != LOCK_OBTAIN_WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
        String reason = "Lock obtain timed out: " + this.toString();
        if (failureReason != null) {
          reason += ": " + failureReason;
        }
        throw new LockObtainFailedException(reason, failureReason);
      }
      try {
        Thread.sleep(LOCK_POLL_INTERVAL);
      } catch (InterruptedException ie) {
        throw new ThreadInterruptedException(ie);
      }
      locked = obtain();
    }
    return locked;
  }

  /** Releases exclusive access. */
  public abstract void close() throws IOException;

  /** Returns true if the resource is currently locked.  Note that one must
   * still call {@link #obtain()} before using the resource. */
  public abstract boolean isLocked() throws IOException;

  /** Utility class for executing code with exclusive access. */
  public abstract static class With {
    private Lock lock;
    private long lockWaitTimeout;

    /** Constructs an executor that will grab the named lock. */
    public With(Lock lock, long lockWaitTimeout) {
      this.lock = lock;
      this.lockWaitTimeout = lockWaitTimeout;
    }

    /** Code to execute with exclusive access. */
    protected abstract Object doBody() throws IOException;

    /** Calls {@link #doBody} while <i>lock</i> is obtained.  Blocks if lock
     * cannot be obtained immediately.  Retries to obtain lock once per second
     * until it is obtained, or until it has tried ten times. Lock is released when
     * {@link #doBody} exits.
     * @throws LockObtainFailedException if lock could not
     * be obtained
     * @throws IOException if {@link Lock#obtain} throws IOException
     */
    public Object run() throws IOException {
      boolean locked = false;
      try {
        locked = lock.obtain(lockWaitTimeout);
        return doBody();
      } finally {
        if (locked) {
          lock.close();
        }
      }
    }
  }
}
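The most important method here is obtain(), which acquires the lock. The no-argument obtain() makes a single attempt and returns immediately. obtain(long lockWaitTimeout) keeps retrying: it calls obtain() once, and if that fails it sleeps for LOCK_POLL_INTERVAL milliseconds (1000 by default) and tries again, until either the lock is acquired or lockWaitTimeout has been used up, at which point it throws LockObtainFailedException. That is the "Lock obtain timed out" exception we saw earlier. You can also pass -1 (LOCK_OBTAIN_WAIT_FOREVER) as lockWaitTimeout, in which case it retries forever until the lock is obtained.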
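The retry loop in obtain(long) can be distilled into a small stand-alone sketch. PollingAcquire and its acquire method are hypothetical names used only for illustration; they are not part of Lucene. As simplifying assumptions, the sketch returns false on timeout where Lucene throws LockObtainFailedException, and it validates the timeout before the first attempt.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper that mirrors the polling loop of Lock.obtain(long). Not Lucene code. */
final class PollingAcquire {
    static final long WAIT_FOREVER = -1;

    /**
     * Tries once, then retries every pollIntervalMs until the attempt succeeds
     * or timeoutMs / pollIntervalMs retries have been used up.
     * Returns true on success and false on timeout.
     */
    static boolean acquire(BooleanSupplier attempt, long timeoutMs, long pollIntervalMs) {
        if (timeoutMs < 0 && timeoutMs != WAIT_FOREVER) {
            throw new IllegalArgumentException("timeoutMs must be WAIT_FOREVER or non-negative");
        }
        boolean locked = attempt.getAsBoolean();
        long maxSleepCount = timeoutMs / pollIntervalMs;
        long sleepCount = 0;
        while (!locked) {
            if (timeoutMs != WAIT_FOREVER && sleepCount++ >= maxSleepCount) {
                return false; // retry budget exhausted
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for lock", ie);
            }
            locked = attempt.getAsBoolean();
        }
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // The simulated lock becomes free on the third attempt.
        boolean ok = acquire(() -> calls.incrementAndGet() >= 3, 100, 10);
        System.out.println("acquired=" + ok + ", attempts=" + calls.get());
        // A lock that never becomes free: gives up after the timeout.
        System.out.println("acquired=" + acquire(() -> false, 30, 10));
    }
}
```

Note that the timeout is counted in sleep intervals rather than measured on a clock, so the real waiting time also includes however long each attempt takes.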
The abstract base class LockFactory defines just one abstract method, makeLock, which returns a Lock instance.
public abstract class LockFactory {

  /**
   * Return a new Lock instance identified by lockName.
   * @param lockName name of the lock to be created.
   */
  public abstract Lock makeLock(Directory dir, String lockName);
}
The abstract class FSLockFactory extends LockFactory:
public abstract class FSLockFactory extends LockFactory {

  /** Returns the default locking implementation for this platform.
   * This method currently returns always {@link NativeFSLockFactory}.
   */
  public static final FSLockFactory getDefault() {
    return NativeFSLockFactory.INSTANCE;
  }

  @Override
  public final Lock makeLock(Directory dir, String lockName) {
    if (!(dir instanceof FSDirectory)) {
      throw new UnsupportedOperationException(getClass().getSimpleName() + " can only be used with FSDirectory subclasses, got: " + dir);
    }
    return makeFSLock((FSDirectory) dir, lockName);
  }

  /** Implement this method to create a lock for a FSDirectory instance. */
  protected abstract Lock makeFSLock(FSDirectory dir, String lockName);
}
Notice this method:
  public static final FSLockFactory getDefault() {
    return NativeFSLockFactory.INSTANCE;
  }
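By default it returns NativeFSLockFactory, which, like SimpleFSLockFactory, is a concrete implementation. NativeFSLockFactory relies on the FileChannel.tryLock method from NIO. We won't go into it here; interested readers can read the JDK NIO source (it seems Oracle no longer ships the source of the FileChannel implementation classes, so you may have to look for it in the JVM sources).
Now for the main event of this article, SimpleFSLockFactory: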
public final class SimpleFSLockFactory extends FSLockFactory {

  /**
   * Singleton instance
   */
  public static final SimpleFSLockFactory INSTANCE = new SimpleFSLockFactory();

  private SimpleFSLockFactory() {}

  @Override
  protected Lock makeFSLock(FSDirectory dir, String lockName) {
    return new SimpleFSLock(dir.getDirectory(), lockName);
  }

  static class SimpleFSLock extends Lock {

    Path lockFile;
    Path lockDir;

    public SimpleFSLock(Path lockDir, String lockFileName) {
      this.lockDir = lockDir;
      lockFile = lockDir.resolve(lockFileName);
    }

    @Override
    public boolean obtain() throws IOException {
      try {
        Files.createDirectories(lockDir);
        Files.createFile(lockFile);
        return true;
      } catch (IOException ioe) {
        // On Windows, on concurrent createNewFile, the 2nd process gets "access denied".
        // In that case, the lock was not aquired successfully, so return false.
        // We record the failure reason here; the obtain with timeout (usually the
        // one calling us) will use this as "root cause" if it fails to get the lock.
        failureReason = ioe;
        return false;
      }
    }

    @Override
    public void close() throws LockReleaseFailedException {
      // TODO: wierd that clearLock() throws the raw IOException...
      try {
        Files.deleteIfExists(lockFile);
      } catch (Throwable cause) {
        throw new LockReleaseFailedException("failed to delete " + lockFile, cause);
      }
    }

    @Override
    public boolean isLocked() {
      return Files.exists(lockFile);
    }

    @Override
    public String toString() {
      return "SimpleFSLock@" + lockFile;
    }
  }
}
SimpleFSLockFactory defines a nested class, SimpleFSLock, which extends Lock. The method to focus on is SimpleFSLock's obtain(), which is where the file lock is actually implemented.
        Files.createDirectories(lockDir);
        Files.createFile(lockFile);
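Look at these two lines. createDirectories creates the directory, along with any missing parent directories, that holds the write.lock file (the name could be something else; Lucene uses write.lock by default). createFile then creates the write.lock file itself. Here is the clever part: if write.lock already exists, createFile throws an exception, which obtain() catches before returning false. An exception therefore means SimpleFSLock failed to acquire the file lock, which in turn means another process is writing to the index.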
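The create-if-absent behaviour can be tried out with the JDK alone. The following is a minimal sketch, not Lucene code: CreateFileLockDemo, tryObtain and release are hypothetical names. One deliberate difference is that SimpleFSLock treats every IOException as "lock not obtained", whereas this sketch only treats FileAlreadyExistsException that way and rethrows anything else.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical demo of a lock file built on Files.createFile. Not Lucene code. */
final class CreateFileLockDemo {

    /** Returns true if this call created the lock file, false if it already existed. */
    static boolean tryObtain(Path lockFile) {
        try {
            Path parent = lockFile.toAbsolutePath().getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            // createFile checks for existence and creates the file in one atomic step.
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // somebody else holds the lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock by deleting the lock file. */
    static void release(Path lockFile) {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempDirectory("lock-demo").resolve("write.lock");
        System.out.println("first obtain:  " + tryObtain(lockFile)); // true
        System.out.println("second obtain: " + tryObtain(lockFile)); // false
        release(lockFile);
        System.out.println("after release: " + tryObtain(lockFile)); // true
        release(lockFile);
    }
}
```

The important property is that the existence check and the creation happen in a single atomic operation, so two callers cannot both see the file as missing and both create it.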
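In close(), the call Files.deleteIfExists(lockFile) means that write.lock is deleted every time the IndexWriter is closed.
To summarize, SimpleFSLockFactory's locking can be understood simply as follows: a write.lock file is created in the directory that holds the index. If that directory already contains write.lock, another process is writing to this index directory, so acquiring the lock fails and nothing can be added to the index. When the IndexWriter is closed, write.lock is deleted and the lock is released. In other words, if write.lock exists, some process is already writing to the index; if it does not exist, creating it takes the lock, and other processes cannot write.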
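This is a very neat way to implement a write lock on files. Some readers may wonder why, in their own demo, write.lock is still there after they build the index and call close(). That is because Lucene's default implementation is now NativeFSLockFactory, the one mentioned above that implements the lock through NIO's native file locking. With it, the lock is the operating-system lock held on the file rather than the file's existence, so a leftover write.lock does not by itself mean the index is locked.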
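The difference from the create-and-delete approach can be seen with FileChannel.tryLock directly. This is a minimal sketch using only JDK APIs, not NativeFSLockFactory's actual code; NativeLockDemo and run are hypothetical names. It assumes no other process touches the file and that the file system supports file locks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo of OS-level file locking through NIO. Not Lucene code. */
final class NativeLockDemo {

    /** Returns {secondAttemptRejected, fileExistsAfterRelease, relockSucceeded}. */
    static boolean[] run(Path lockFile) {
        try {
            boolean rejected;
            try (FileChannel first = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
                 FileLock held = first.tryLock()) {
                if (held == null) {
                    throw new IllegalStateException("lock is held by another process");
                }
                // A second attempt from the same JVM overlaps the lock we already hold.
                try (FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
                    second.tryLock();
                    rejected = false;
                } catch (OverlappingFileLockException e) {
                    rejected = true;
                }
            } // the lock is released here, then the channel is closed

            // Releasing the lock does not delete the file.
            boolean stillThere = Files.exists(lockFile);

            // The leftover file does not prevent taking the lock again.
            boolean relocked;
            try (FileChannel again = FileChannel.open(lockFile, StandardOpenOption.WRITE);
                 FileLock lock = again.tryLock()) {
                relocked = lock != null;
            }
            return new boolean[] {rejected, stillThere, relocked};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempDirectory("native-lock-demo").resolve("write.lock");
        boolean[] r = run(lockFile);
        System.out.println("second tryLock rejected:   " + r[0]);
        System.out.println("file exists after release: " + r[1]);
        System.out.println("re-lock succeeded:         " + r[2]);
    }
}
```

Within a single JVM an overlapping request is reported as OverlappingFileLockException, while a lock held by a different process makes tryLock return null, which is why the sketch checks for both.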