Delaying Shard Allocation

As discussed way back in Scale Horizontally, Elasticsearch will automatically balance shards between your available nodes, both when new nodes are added and when existing nodes leave.

Theoretically, this is the best thing to do. We want to recover missing primaries by promoting replicas as soon as possible. We also want to make sure resources are balanced evenly across the cluster to prevent hotspots.

In practice, however, immediately re-balancing can cause more problems than it solves. For example, consider this situation:

  1. Node 19 loses connectivity to your network (someone tripped on the power cable).
  2. Immediately, the master notices the node departure. It determines which primary shards were on Node 19 and promotes the corresponding replicas around the cluster.
  3. After replicas have been promoted to primary, the master begins issuing recovery commands to rebuild the now-missing replicas. Nodes around the cluster fire up their NICs and start pumping shard data to each other in an attempt to get back to green health status.
  4. This process will likely trigger a small cascade of shard movement, since the cluster is now unbalanced. Unrelated shards will be moved between hosts to achieve better balancing.

Meanwhile, the hapless admin who kicked out the power cable plugs it back in. Node 19 reboots and rejoins the cluster. Unfortunately, the node is informed that its existing data is now useless, because that data has already been re-allocated elsewhere. So Node 19 deletes its local data and begins recovering a different set of shards from the cluster (which then causes a new minor re-balancing dance).

If this all sounds needless and expensive, you’re right. It is, but only when you know the node will be back soon. If Node 19 were truly gone, the above procedure is exactly what we would want to happen.

To help address these transient outages, Elasticsearch has the ability to delay shard allocation. This gives your cluster time to see if nodes will rejoin before starting the re-balancing dance.

Changing the default delay

By default, the cluster will wait one minute to see if the node will rejoin. If the node rejoins before the timer expires, the rejoining node will use its existing shards and no shard allocation occurs.

This default time can be changed either globally, or on a per-index basis, by configuring the delayed_timeout setting:

PUT /_all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}

By using the _all index name, we can apply this setting to all indices in the cluster. Here, the default time is changed to 5 minutes.

The setting is dynamic and can be changed at runtime. If you would like shards to allocate immediately instead of waiting, you can set delayed_timeout: 0.
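The request above, along with its per-index and zero-delay variants, can be sketched in Python. This is a minimal sketch using only the standard library. The host `http://localhost:9200`, the index name `logs`, and the helper name `build_delayed_timeout_request` are assumptions for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_delayed_timeout_request(timeout, index="_all",
                                  host="http://localhost:9200"):
    """Build (but do not send) a PUT request that sets delayed_timeout.

    `host` is an assumed address; `index` may be `_all` or a single
    index name for a per-index change.
    """
    body = {
        "settings": {
            "index.unassigned.node_left.delayed_timeout": timeout
        }
    }
    return urllib.request.Request(
        url=f"{host}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Cluster-wide change to five minutes, as in the request above.
all_indices = build_delayed_timeout_request("5m")
print(all_indices.get_method(), all_indices.full_url)

# Per-index change that disables the delay ("logs" is a hypothetical index).
one_index = build_delayed_timeout_request("0", index="logs")
print(one_index.get_method(), one_index.full_url)
```

Passing either request object to `urllib.request.urlopen` would send it to a live cluster. Keeping the build step separate makes it easy to inspect the URL and body before anything is changed.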

Delayed allocation won’t prevent replicas from being promoted to primaries. The cluster will still perform promotions as necessary to get the cluster back to yellow status. The allocation of the now-missing replicas is the only process that is delayed.
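One way to observe this state is to look at the cluster health response while the timer is running. The sketch below assumes your Elasticsearch version reports a `delayed_unassigned_shards` field in `GET /_cluster/health`. The helper name and the sample response values are hypothetical and stand in for a parsed JSON response.

```python
def describe_delayed_allocation(health):
    """Summarise a parsed GET /_cluster/health response (a dict)."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    # Assumed field: shards whose allocation is being held back by the delay.
    delayed = health.get("delayed_unassigned_shards", 0)
    if delayed:
        return (f"{status}: {delayed} of {unassigned} unassigned shards "
                "are waiting for the delay to expire")
    if unassigned:
        return f"{status}: {unassigned} unassigned shards, none delayed"
    return f"{status}: all shards assigned"


# Hypothetical response shortly after a node holding three replicas left.
sample = {
    "status": "yellow",
    "unassigned_shards": 3,
    "delayed_unassigned_shards": 3,
}
print(describe_delayed_allocation(sample))
```

A yellow status with a non-zero delayed count matches the behaviour described above: primaries are available, and only the replacement replicas are on hold.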

Auto-cancellation of shard relocation

What happens if the node comes back after the timeout expires, but before the cluster has finished moving shards around? In this case, Elasticsearch will check to see if the on-disk data matches the current "live" data in the primary shard. If the two shards are identical (meaning there have been no new documents, updates, or deletes), the master will cancel the ongoing rebalancing and restore the on-disk data.

This is done since recovery of on-disk data will always be faster than transferring over the network, and since we can guarantee the shards are identical, the process is a win-win.

If the shards have diverged (e.g. new documents have been indexed since the node went down), the recovery process will continue as normal. The rejoining node will delete its local, outdated shards and obtain a new set.
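The three outcomes described in this chapter can be summarised as a small decision function. This is a toy model for reasoning about the behaviour, not Elasticsearch's implementation. The function name, its parameters, and the returned strings are all made up for illustration, and the 60-second default simply mirrors the one-minute default mentioned earlier.

```python
def rejoin_outcome(seconds_away, delay_seconds=60, shards_identical=False):
    """Toy model of what happens to a rejoining node's on-disk shards.

    seconds_away     -- how long the node was gone before rejoining
    delay_seconds    -- the configured delayed_timeout, in seconds
    shards_identical -- whether on-disk data still matches the primary
    """
    if seconds_away < delay_seconds:
        # The node returned before the timer expired.
        return "reuse local shards; no allocation was started"
    if shards_identical:
        # The timer expired, but nothing was written in the meantime.
        return "cancel in-flight relocation; reuse local shards"
    # The timer expired and the shards have diverged.
    return "continue recovery; delete outdated local shards"


print(rejoin_outcome(30))
print(rejoin_outcome(120, shards_identical=True))
print(rejoin_outcome(120, shards_identical=False))
```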
