Optimizing full-text index construction in NEO4J for hundreds of millions of records
When NEO4J full-text search is the main entry point into the graph, tuning the graph's search engine is critical.
I. Data scale (hundreds of millions)
count(relationships):500584016
count(nodes):765485810
II. How the index is built
The full-text index is built by a script that runs the build program in the background on the server:
index.sh
#!/usr/bin/env bash
nohup /neo4j-community-3.4.9/bin/neo4j-shell -file build.cql >>indexGraph.log 2>&1 &
build.cql
CALL zdr.index.addChineseFulltextIndex('IKAnalyzer', ['description','fullname','name','lnkurl'], 'LinkedinID') YIELD message RETURN message;
III. Exception raised during the index build
ERROR (-v for expanded information):
TransactionFailureException: The database has encountered a critical error, and needs to be restarted. Please see database logs for more details.
-host Domain name or IP of host to connect to (default: localhost)
-port Port of host to connect to (default: 1337)
-name RMI name, i.e. rmi://<host>:<port>/<name> (default: shell)
-pid Process ID to connect to
-c Command line to execute. After executing it the shell exits
-file File containing commands to execute, or '-' to read from stdin. After executing it the shell exits
-readonly Connect in readonly mode (only for connecting with -path)
-path Points to a neo4j db path so that a local server can be started there
-config Points to a config file when starting a local server
Example arguments for remote:
-port 1337
-host 192.168.1.234 -port 1337 -name shell
-host localhost -readonly
...or no arguments for default values
Example arguments for local:
-path /path/to/db
-path /path/to/db -config /path/to/neo4j.config
-path /path/to/db -readonly
Caused by: java.lang.OutOfMemoryError: Java heap space | GB+Tree[file:/u02/isi/zdr/graph/neo4j-community-3.4.9/data/databases/graph.db/schema/index/lucene_native-2.0/134/string-1.0/index-134, layout:StringLayout[version:0.1, identifier:24016946018123776], generation:16587/16588]
at org.neo4j.io.pagecache.impl.muninn.CursorFactory.takeWriteCursor(CursorFactory.java:62)
at org.neo4j.io.pagecache.impl.muninn.MuninnPagedFile.io(MuninnPagedFile.java:186)
at org.neo4j.index.internal.gbptree.FreeListIdProvider.releaseId(FreeListIdProvider.java:217)
at org.neo4j.index.internal.gbptree.InternalTreeLogic.createSuccessorIfNeeded(InternalTreeLogic.java:1289)
at org.neo4j.index.internal.gbptree.InternalTreeLogic.insertInLeaf(InternalTreeLogic.java:513)
at org.neo4j.index.internal.gbptree.InternalTreeLogic.insert(InternalTreeLogic.java:356)
at org.neo4j.index.internal.gbptree.GBPTree$SingleWriter.merge(GBPTree.java:1234)
at org.neo4j.kernel.impl.index.schema.NativeSchemaIndexUpdater.processAdd(NativeSchemaIndexUpdater.java:132)
at org.neo4j.kernel.impl.index.schema.NativeSchemaIndexUpdater.processUpdate(NativeSchemaIndexUpdater.java:86)
at org.neo4j.kernel.impl.index.schema.NativeSchemaIndexUpdater.process(NativeSchemaIndexUpdater.java:61)
at org.neo4j.kernel.impl.index.schema.fusion.FusionIndexUpdater.process(FusionIndexUpdater.java:41)
at org.neo4j.kernel.impl.api.index.updater.DelegatingIndexUpdater.process(DelegatingIndexUpdater.java:40)
at org.neo4j.kernel.impl.api.index.IndexingService.processUpdate(IndexingService.java:516)
at org.neo4j.kernel.impl.api.index.IndexingService.apply(IndexingService.java:479)
at org.neo4j.kernel.impl.api.index.IndexingService.apply(IndexingService.java:463)
at org.neo4j.kernel.impl.transaction.command.IndexUpdatesWork.apply(IndexUpdatesWork.java:63)
at org.neo4j.kernel.impl.transaction.command.IndexUpdatesWork.apply(IndexUpdatesWork.java:42)
at org.neo4j.concurrent.WorkSync.doSynchronizedWork(WorkSync.java:231)
at org.neo4j.concurrent.WorkSync.tryDoWork(WorkSync.java:157)
at org.neo4j.concurrent.WorkSync.apply(WorkSync.java:91)
Java code implementing the index
/**
 * @param indexName name of the full-text index
 * @param labelName label whose nodes are indexed
 * @param propKeys  property keys to index
 * @return summary message
 * @Description: TODO (build the index and return MESSAGE; automatic updates are not supported)
 */
private String chineseFulltextIndex(String indexName, String labelName, List<String> propKeys) {
    Label label = Label.label(labelName);
    // Find all nodes carrying this label
    ResourceIterator<Node> nodes = db.findNodes(label);
    System.out.println("nodes:" + nodes.toString());
    int nodesSize = 0;
    int propertiesSize = 0;
    // Problem with this loop: after about 30 million updates the program starts to stall
    while (nodes.hasNext()) {
        nodesSize++;
        Node node = nodes.next();
        System.out.println("current nodes:" + node.toString());
        // Properties on this node that need to be indexed
        Set<Map.Entry<String, Object>> properties = node.getProperties(propKeys.toArray(new String[0])).entrySet();
        System.out.println("current node properties" + properties);
        // Check whether the index already exists; if so, remove this node from it
        if (db.index().existsForNodes(indexName)) {
            Index<Node> oldIndex = db.index().forNodes(indexName);
            System.out.println("current node index" + oldIndex);
            oldIndex.remove(node);
        }
        // Add a full-text index entry for each property of this node that needs indexing
        Index<Node> nodeIndex = db.index().forNodes(indexName, FULL_INDEX_CONFIG);
        for (Map.Entry<String, Object> property : properties) {
            propertiesSize++;
            nodeIndex.add(node, property.getKey(), property.getValue());
        }
        // Compute elapsed time
    }
    String message = "IndexName:" + indexName + ",LabelName:" + labelName + ",NodesSize:" + nodesSize + ",PropertiesSize:" + propertiesSize;
    return message;
}
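IV. Optimizing the full-text index code
1. java.lang.OutOfMemoryError
java.lang.OutOfMemoryError is a subclass of java.lang.VirtualMachineError, the error type thrown when the Java virtual machine is broken or has run out of the resources it needs to keep operating.
2. Accessing the database
A program that accesses the database acquires locks and memory, and neither is released until the transaction completes. That makes the cause of the bug above easy to see: the indexing program in section III fetches the nodes and then builds the index inside a WHILE loop, and only when the whole index has been built is the transaction closed and its memory reclaimed. With a very large amount of data, running out of memory is inevitable.
3. Optimization approach
Commit the work in batched transactions.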
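The batching idea can be shown in isolation before looking at the Neo4j-specific code. The following is a minimal sketch, not the author's implementation: `Tx` and `TxFactory` are hypothetical stand-ins for a transaction and for something like `db.beginTx()`, and the "work" per item is left empty. It only demonstrates the control flow: commit and reopen after every full batch, then commit the final partial batch.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

/** Sketch of batched commits; Tx and TxFactory are hypothetical stand-ins, not Neo4j types. */
final class BatchCommitSketch {

    /** Hypothetical minimal transaction: resources are held until close(). */
    interface Tx extends AutoCloseable {
        void success();

        @Override
        void close();
    }

    /** Hypothetical factory standing in for a call such as db.beginTx(). */
    interface TxFactory {
        Tx begin();
    }

    /** Processes every item, committing after each full batch; returns the number of commits. */
    static <T> int processInBatches(Iterator<T> items, int batchSize, TxFactory factory) {
        int commits = 0;
        int inBatch = 0;
        Tx tx = factory.begin();
        try {
            while (items.hasNext()) {
                items.next();              // the real work (indexing one node) would happen here
                if (++inBatch == batchSize) {
                    tx.success();
                    tx.close();            // release what this batch was holding
                    commits++;
                    inBatch = 0;
                    tx = factory.begin();  // fresh transaction for the next batch
                }
            }
            tx.success();                  // commit the final, possibly partial, batch
            commits++;
        } finally {
            tx.close();
        }
        return commits;
    }

    public static void main(String[] args) {
        TxFactory noop = () -> new Tx() {
            @Override
            public void success() { }

            @Override
            public void close() { }
        };
        Iterator<Integer> items = IntStream.range(0, 120_001).iterator();
        System.out.println(processInBatches(items, 50_000, noop));
    }
}
```

With 120,001 items and a batch size of 50,000 the sketch commits three times: two full batches plus the remainder. The point of the design is that the amount of state held by any single transaction is bounded by the batch size rather than by the total number of nodes.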
4. Optimized code
/**
 * @param indexName name of the full-text index
 * @param labelName label whose nodes are indexed
 * @param propKeys  property keys to index
 * @return summary message
 * @Description: TODO (build the index and return MESSAGE; automatic updates are not supported)
 */
private String chineseFulltextIndex(String indexName, String labelName, List<String> propKeys) {
    Label label = Label.label(labelName);
    int nodesSize = 0;
    int propertiesSize = 0;
    // Find all nodes carrying this label
    ResourceIterator<Node> nodes = db.findNodes(label);
    Transaction tx = db.beginTx();
    try {
        int batch = 0;
        long startTime = System.nanoTime();
        while (nodes.hasNext()) {
            nodesSize++;
            Node node = nodes.next();
            boolean indexed = false;
            // Properties on this node that need to be indexed
            Set<Map.Entry<String, Object>> properties = node.getProperties(propKeys.toArray(new String[0])).entrySet();
            // Check whether the index already exists; if so, remove this node from it
            if (db.index().existsForNodes(indexName)) {
                Index<Node> oldIndex = db.index().forNodes(indexName);
                oldIndex.remove(node);
            }
            // Add a full-text index entry for each property of this node that needs indexing
            Index<Node> nodeIndex = db.index().forNodes(indexName, FULL_INDEX_CONFIG);
            for (Map.Entry<String, Object> property : properties) {
                indexed = true;
                propertiesSize++;
                nodeIndex.add(node, property.getKey(), property.getValue());
            }
            // Commit the transaction in batches
            if (indexed) {
                if (++batch == 50_000) {
                    batch = 0;
                    tx.success();
                    tx.close();
                    tx = db.beginTx();
                    // Compute elapsed time
                    startTime = indexConsumeTime(startTime, nodesSize, propertiesSize);
                }
            }
        }
        tx.success();
        // Compute elapsed time
        indexConsumeTime(startTime, nodesSize, propertiesSize);
    } finally {
        tx.close();
    }
    String message = "IndexName:" + indexName + ",LabelName:" + labelName + ",NodesSize:" + nodesSize + ",PropertiesSize:" + propertiesSize;
    return message;
}
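The listing above calls indexConsumeTime, which is not shown in the post. The following is a hypothetical reconstruction, assuming the helper prints one progress line in the same shape as the build log (nodeSize, propertieSize, consume in milliseconds) and returns a fresh start timestamp for the next batch. The class name `IndexTimer` and the split into `formatProgress` are assumptions made for illustration; the original helper may differ.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical reconstruction of the timing helper; not taken from the original post. */
final class IndexTimer {

    /** Formats one progress line in the shape seen in the build log. */
    static String formatProgress(int nodeSize, int propertieSize, long elapsedMs) {
        return "Build index-nodeSize:" + nodeSize
                + ",propertieSize:" + propertieSize
                + ",consume:" + elapsedMs + "ms";
    }

    /**
     * Prints the time elapsed since startTime (a System.nanoTime() value)
     * and returns a new start timestamp for the next batch.
     */
    static long indexConsumeTime(long startTime, int nodeSize, int propertieSize) {
        long now = System.nanoTime();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(now - startTime);
        System.out.println(formatProgress(nodeSize, propertieSize, elapsedMs));
        return now;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        indexConsumeTime(start, 50_000, 148_777);
    }
}
```

Returning the new timestamp lets the caller write `startTime = indexConsumeTime(startTime, ...)`, so each log line reports the time of one batch rather than the time since the build started.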
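5. Execution efficiency test
Commits are made in batches of 50_000. nodeSize and propertieSize are running totals, and consume is the time taken by each batch.
The first batches take longer; after that the time settles at roughly 2 s to 5 s per batch of 50,000. One billion nodes is 20,000 batches, so the estimated total is between about 11 h and 28 h.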
Build index-nodeSize:50000,propertieSize:148777,consume:21434ms
Build index-nodeSize:100000,propertieSize:297883,consume:18493ms
Build index-nodeSize:150000,propertieSize:446936,consume:17140ms
Build index-nodeSize:200000,propertieSize:595981,consume:17323ms
Build index-nodeSize:250000,propertieSize:745039,consume:19680ms
Build index-nodeSize:300000,propertieSize:894026,consume:18451ms
Build index-nodeSize:350000,propertieSize:1042994,consume:20266ms
Build index-nodeSize:400000,propertieSize:1160186,consume:12787ms
Build index-nodeSize:450000,propertieSize:1210186,consume:1946ms
Build index-nodeSize:500000,propertieSize:1260186,consume:3174ms
Build index-nodeSize:550000,propertieSize:1310186,consume:3090ms
Build index-nodeSize:600000,propertieSize:1360186,consume:3063ms
Build index-nodeSize:650000,propertieSize:1410186,consume:1868ms
Build index-nodeSize:700000,propertieSize:1460186,consume:2036ms
Build index-nodeSize:750000,propertieSize:1510186,consume:3784ms
Build index-nodeSize:800000,propertieSize:1560186,consume:3037ms
Build index-nodeSize:850000,propertieSize:1610186,consume:2627ms
Build index-nodeSize:900000,propertieSize:1660186,consume:1900ms
Build index-nodeSize:950000,propertieSize:1710186,consume:2944ms
Build index-nodeSize:1000000,propertieSize:1760186,consume:3369ms
Build index-nodeSize:1050000,propertieSize:1810186,consume:3289ms
Build index-nodeSize:1100000,propertieSize:1860186,consume:2763ms
Build index-nodeSize:1150000,propertieSize:1910186,consume:3237ms
Build index-nodeSize:1200000,propertieSize:1960186,consume:3408ms
Build index-nodeSize:1250000,propertieSize:2010186,consume:3644ms
Build index-nodeSize:1300000,propertieSize:2060186,consume:3661ms
Build index-nodeSize:1350000,propertieSize:2110186,consume:2964ms
Build index-nodeSize:1400000,propertieSize:2160186,consume:3219ms
Build index-nodeSize:1450000,propertieSize:2210186,consume:3356ms
Build index-nodeSize:1500000,propertieSize:2260186,consume:4115ms
Build index-nodeSize:1550000,propertieSize:2310186,consume:3188ms
Build index-nodeSize:1600000,propertieSize:2360186,consume:3364ms
Build index-nodeSize:1650000,propertieSize:2410186,consume:3799ms
Build index-nodeSize:1700000,propertieSize:2460186,consume:4301ms
Build index-nodeSize:1750000,propertieSize:2510186,consume:3772ms
Build index-nodeSize:1800000,propertieSize:2560186,consume:3692ms
Build index-nodeSize:1850000,propertieSize:2610186,consume:3428ms
Build index-nodeSize:1900000,propertieSize:2660186,consume:2930ms
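The extrapolation to one billion nodes can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that assumes the steady-state rate of 2 s to 5 s per batch holds for the whole build; the class and method names are illustrative only.

```java
import java.util.Locale;

/** Back-of-the-envelope build-time estimate; assumes a constant per-batch time. */
final class BuildTimeEstimate {

    /** Estimated hours to index totalNodes at a fixed batch size and per-batch time. */
    static double estimateHours(long totalNodes, int batchSize, double secondsPerBatch) {
        long batches = (totalNodes + batchSize - 1) / batchSize; // round up to whole batches
        return batches * secondsPerBatch / 3600.0;
    }

    public static void main(String[] args) {
        long totalNodes = 1_000_000_000L;
        int batchSize = 50_000;
        System.out.println(String.format(Locale.ROOT, "%.1f h to %.1f h",
                estimateHours(totalNodes, batchSize, 2.0),
                estimateHours(totalNodes, batchSize, 5.0)));
    }
}
```

One billion nodes in batches of 50,000 is 20,000 batches, which gives about 11.1 h at 2 s per batch and about 27.8 h at 5 s per batch. The estimate only holds if the per-batch time stays flat.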
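Note: on this test data set, after the index build had run for 2 hours, 14.95 million nodes had been indexed and the speed had dropped markedly. Further optimization is needed.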
Build index-nodeSize:13850000,propertieSize:14610186,consume:97290ms
Build index-nodeSize:13900000,propertieSize:14660186,consume:7441ms
Build index-nodeSize:13950000,propertieSize:14710186,consume:3730ms
Build index-nodeSize:14000000,propertieSize:14760186,consume:3512ms
Build index-nodeSize:14050000,propertieSize:14810186,consume:4545ms
Build index-nodeSize:14100000,propertieSize:14860186,consume:12100ms
Build index-nodeSize:14150000,propertieSize:14910186,consume:83071ms
Build index-nodeSize:14200000,propertieSize:14960186,consume:7417ms
Build index-nodeSize:14250000,propertieSize:15010186,consume:3579ms
Build index-nodeSize:14300000,propertieSize:15060186,consume:64841ms
Build index-nodeSize:14350000,propertieSize:15110186,consume:7553ms
Build index-nodeSize:14400000,propertieSize:15160186,consume:63141ms
Build index-nodeSize:14450000,propertieSize:15210186,consume:64316ms
Build index-nodeSize:14500000,propertieSize:15260186,consume:187510ms
Build index-nodeSize:14550000,propertieSize:15310186,consume:247571ms
Build index-nodeSize:14600000,propertieSize:15360186,consume:224611ms
Build index-nodeSize:14650000,propertieSize:15410186,consume:244539ms
Build index-nodeSize:14700000,propertieSize:15460186,consume:354684ms
Build index-nodeSize:14750000,propertieSize:15510186,consume:236970ms
Build index-nodeSize:14800000,propertieSize:15560186,consume:308532ms
Build index-nodeSize:14850000,propertieSize:15610186,consume:429815ms
Build index-nodeSize:14900000,propertieSize:15660186,consume:409451ms
Build index-nodeSize:14950000,propertieSize:15710186,consume:456980ms
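After the build program had run for 4 hours, 15.3 million nodes had been indexed and index construction had slowed to an almost unacceptable pace. Optimization is ongoing…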
Build index-nodeSize:14750000,propertieSize:15510186,consume:236970ms
Build index-nodeSize:14800000,propertieSize:15560186,consume:308532ms
Build index-nodeSize:14850000,propertieSize:15610186,consume:429815ms
Build index-nodeSize:14900000,propertieSize:15660186,consume:409451ms
Build index-nodeSize:14950000,propertieSize:15710186,consume:456980ms
Build index-nodeSize:15000000,propertieSize:15760186,consume:447474ms
Build index-nodeSize:15050000,propertieSize:15810186,consume:580270ms
Build index-nodeSize:15100000,propertieSize:15860186,consume:840488ms
Build index-nodeSize:15150000,propertieSize:15910186,consume:573554ms
Build index-nodeSize:15200000,propertieSize:15960186,consume:748670ms
Build index-nodeSize:15250000,propertieSize:16010186,consume:1305363ms
Build index-nodeSize:15300000,propertieSize:16060186,consume:2495139ms
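To put a number on the slowdown, the log lines can be converted into throughput. The sketch below parses one line of the log format shown above and computes nodes per second for that batch. It assumes every line covers exactly one batch of 50,000 nodes; the class name and the return value of -1 for non-matching lines are choices made for this illustration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Converts build-log lines into per-batch throughput; assumes one batch per line. */
final class BuildLogThroughput {

    private static final Pattern LINE = Pattern.compile(
            "Build index-nodeSize:(\\d+),propertieSize:(\\d+),consume:(\\d+)ms");

    /** Nodes per second for the batch described by the line, or -1 if the line does not match. */
    static double nodesPerSecond(String line, int batchSize) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return -1;
        }
        long consumeMs = Long.parseLong(m.group(3));
        if (consumeMs == 0) {
            return Double.POSITIVE_INFINITY;
        }
        return batchSize * 1000.0 / consumeMs;
    }

    public static void main(String[] args) {
        String early = "Build index-nodeSize:450000,propertieSize:1210186,consume:1946ms";
        String late = "Build index-nodeSize:15300000,propertieSize:16060186,consume:2495139ms";
        System.out.println(String.format(Locale.ROOT, "early: %.0f nodes/s",
                nodesPerSecond(early, 50_000)));
        System.out.println(String.format(Locale.ROOT, "late: %.0f nodes/s",
                nodesPerSecond(late, 50_000)));
    }
}
```

For the two sample lines, the early batch runs at roughly 25,700 nodes per second and the last batch in the log at roughly 20 nodes per second, a drop of about three orders of magnitude.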