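Baidu Elasticsearch - Product Description - Introduction - Baidu Cloud https://cloud.baidu.com/doc/BES/FAQ.html#.2C.BB.93.08.C9.7E.2F.A3.E7.35.BE.E5.FA.BD.F6.0E How large should a shard in ES generally be? Each ES shard is a Lucene index, and a single Lucene index can store only about 2 billion documents, so a shard can likewise store at most about 2 billion documents. In addition, we recommend keeping each shard between 10 GB and 50 GB; if a shard is too large, queries will be slower,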
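The two limits in this FAQ answer (a document cap per shard and a recommended size range) can be turned into a rough shard-count estimate. The sketch below is only an illustration: the function name `suggest_primary_shards` and the 30 GB target are assumptions chosen for the example, not anything the FAQ prescribes.

```python
import math

# Figure quoted in the FAQ: roughly 2 billion documents per shard.
MAX_DOCS_PER_SHARD = 2_000_000_000
# Hypothetical target inside the recommended 10-50 GB range.
TARGET_SHARD_GB = 30


def suggest_primary_shards(total_gb, total_docs, target_gb=TARGET_SHARD_GB):
    """Return the smallest shard count that satisfies both limits."""
    by_size = math.ceil(total_gb / target_gb)            # keep shards near target size
    by_docs = math.ceil(total_docs / MAX_DOCS_PER_SHARD)  # stay under the document cap
    return max(1, by_size, by_docs)


if __name__ == "__main__":
    # 600 GB and 1 billion documents: the size limit dominates.
    print(suggest_primary_shards(600, 1_000_000_000))
    # 10 GB but 5 billion tiny documents: the document cap dominates.
    print(suggest_primary_shards(10, 5_000_000_000))
```

Whichever limit asks for more shards wins, which is why an index of many tiny documents can need several shards even when it is small on disk.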
org.apache.lucene.index Enum Constants. DOCS_AND_FREQS: Only documents and term frequencies are indexed; positions are omitted. DOCS_AND_FREQS_AND_POSITIONS: Indexes documents, frequencies and positions. DOCS_AND_FREQS_AND
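To make the difference between the first two options concrete, here is a toy inverted index in Python. It is not Lucene code and does not use the Lucene API; `build_postings` and its `with_positions` flag are hypothetical names that only mimic what each option keeps per term and document.

```python
def build_postings(docs, with_positions):
    """Build a toy postings map: term -> {doc_id: {"freq": n, "positions": [...]}}.

    with_positions=False mimics a docs-and-frequencies index;
    with_positions=True also records the token offsets of each occurrence.
    """
    postings = {}
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            entry = postings.setdefault(term, {}).setdefault(doc_id, {"freq": 0})
            entry["freq"] += 1
            if with_positions:
                entry.setdefault("positions", []).append(pos)
    return postings


if __name__ == "__main__":
    docs = ["to be or not to be"]
    print(build_postings(docs, with_positions=False)["to"])
    print(build_postings(docs, with_positions=True)["to"])
```

Without positions the index can still score by term frequency, but it cannot tell whether two terms are adjacent, which is what phrase queries need.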
Tuning BM25 One of the nice features of BM25 is that, unlike TF/IDF, it has two parameters that allow it to be tuned: k1 This parameter controls how quickly an increase in term frequency results in term-frequency saturation. The default value is 1.2.
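The saturation effect of k1 can be seen with a few numbers. The sketch below uses the textbook BM25 term-frequency factor and assumes the document has exactly average length, so length normalization drops out; the function name `bm25_tf` is made up for the example, and Lucene's own implementation may differ in detail.

```python
def bm25_tf(tf, k1=1.2):
    """Textbook BM25 term-frequency factor for an average-length document.

    The value grows with tf but can never exceed k1 + 1.
    """
    return tf * (k1 + 1) / (tf + k1)


if __name__ == "__main__":
    for k1 in (0.5, 1.2, 2.0):
        # Divide by the ceiling (k1 + 1) to show how close to saturation each tf is.
        row = [round(bm25_tf(tf, k1) / (k1 + 1), 3) for tf in (1, 2, 5, 10, 50)]
        print(f"k1={k1}: fraction of maximum at tf=1,2,5,10,50 -> {row}")
```

A smaller k1 reaches its ceiling after only a few occurrences, while a larger k1 keeps rewarding additional occurrences for longer.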