Pod Lifecycle

This page describes the lifecycle of a Pod.

Pod phase

A Pod’s status field is a PodStatus object, which has a phase field.

The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. The phase is not intended to be a comprehensive rollup of observations of Container or Pod state, nor is it intended to be a comprehensive state machine.

The number and meanings of Pod phase values are tightly guarded. Other than what is documented here, nothing should be assumed about Pods that have a given phase value.

Here are the possible values for phase:

  • Pending: The Pod has been accepted by the Kubernetes system, but one or more of the Container images has not been created. This includes time before being scheduled as well as time spent downloading images over the network, which could take a while.

  • Running: The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting.

  • Succeeded: All Containers in the Pod have terminated in success, and will not be restarted.

  • Failed: All Containers in the Pod have terminated, and at least one Container has terminated in failure. That is, the Container either exited with non-zero status or was terminated by the system.

  • Unknown: For some reason the state of the Pod could not be obtained, typically due to an error in communicating with the host of the Pod.

Pod conditions

A Pod has a PodStatus, which has an array of PodConditions. Each element of the PodCondition array has a type field and a status field. The type field is a string, with possible values PodScheduled, Ready, Initialized, and Unschedulable. The status field is a string, with possible values True, False, and Unknown.
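As a rough illustration of how the phase and the conditions sit side by side, the excerpt below sketches the status section of a Pod as it might appear in the output of kubectl get pod <pod-name> -o yaml. The values are invented for illustration and most fields are omitted.

```yaml
# Illustrative excerpt only; the values are made up, not taken from a real cluster.
status:
  phase: Running            # coarse, high-level summary
  conditions:               # array of PodConditions
  - type: PodScheduled
    status: "True"
  - type: Initialized
    status: "True"
  - type: Ready             # False here, e.g. while a readiness probe is failing
    status: "False"
```

Note that a Pod can be in the Running phase while its Ready condition is False, which is one reason the phase should not be read as a full state machine.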

Container probes

A Probe is a diagnostic performed periodically by the kubelet on a Container. To perform a diagnostic, the kubelet calls a Handler implemented by the Container. There are three types of handlers:

  • ExecAction: Executes a specified command inside the Container. The diagnostic is considered successful if the command exits with a status code of 0.

  • TCPSocketAction: Performs a TCP check against the Container’s IP address on a specified port. The diagnostic is considered successful if the port is open.

  • HTTPGetAction: Performs an HTTP Get request against the Container’s IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.
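The three handler types listed above might be configured as in the following sketch, which puts one handler on each of three containers. The Pod name, container names, and images (busybox, redis, nginx) are assumptions chosen only for illustration; the ports are the defaults those images normally listen on.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-handlers-demo        # hypothetical name
spec:
  containers:
  - name: exec-check
    image: busybox
    args: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 3600"]
    livenessProbe:
      exec:                        # ExecAction: success if the command exits with 0
        command: ["cat", "/tmp/healthy"]
  - name: tcp-check
    image: redis
    livenessProbe:
      tcpSocket:                   # TCPSocketAction: success if the port is open
        port: 6379
  - name: http-check
    image: nginx
    livenessProbe:
      httpGet:                     # HTTPGetAction: success if 200 <= status code < 400
        path: /
        port: 80
```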

Each probe has one of three results:

  • Success: The Container passed the diagnostic.
  • Failure: The Container failed the diagnostic.
  • Unknown: The diagnostic failed, so no action should be taken.

The kubelet can optionally perform and react to two kinds of probes on running Containers:

  • livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.

  • readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.

When should you use liveness or readiness probes?

If the process in your Container is able to crash on its own whenever it encounters an issue or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet will automatically perform the correct action in accordance with the Pod’s restartPolicy.

If you’d like your Container to be killed and restarted if a probe fails, then specify a liveness probe, and specify a restartPolicy of Always or OnFailure.

If you’d like to start sending traffic to a Pod only when a probe succeeds, specify a readiness probe. In this case, the readiness probe might be the same as the liveness probe, but the existence of the readiness probe in the spec means that the Pod will start without receiving any traffic and only start receiving traffic after the probe starts succeeding.

If you want your Container to be able to take itself down for maintenance, you can specify a readiness probe that checks an endpoint specific to readiness that is different from the liveness probe.

Note that if you just want to be able to drain requests when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether the readiness probe exists. The Pod remains in the unready state while it waits for the Containers in the Pod to stop.
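To make the distinction concrete, here is a minimal sketch of a container that uses separate endpoints for liveness and readiness. The image name and the /ready path are hypothetical; the sketch assumes the application serves /healthz while the process is healthy and serves /ready only when it wants to receive traffic.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                        # hypothetical name
  labels:
    app: web
spec:
  containers:
  - name: web
    image: example.com/web:1.0     # hypothetical image serving /healthz and /ready
    ports:
    - containerPort: 8080
    livenessProbe:                 # failure: kubelet kills the Container, restartPolicy applies
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 15
    readinessProbe:                # failure: Pod IP is removed from matching Service endpoints
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
```

With this layout the application can take itself out of rotation for maintenance by failing /ready while still passing /healthz, so it stops receiving traffic without being restarted.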

Pod and Container status

For detailed information about Pod and Container status, see PodStatus and ContainerStatus. Note that the information reported as Pod status depends on the current ContainerState.

Restart policy

A PodSpec has a restartPolicy field with possible values Always, OnFailure, and Never. The default value is Always. restartPolicy applies to all Containers in the Pod. restartPolicy only refers to restarts of the Containers by the kubelet on the same node. Failed Containers that are restarted by the kubelet are restarted with an exponential back-off delay (10s, 20s, 40s …) capped at five minutes; the delay is reset after ten minutes of successful execution. As discussed in the Pods document, once bound to a node, a Pod will never be rebound to another node.
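A minimal sketch of where restartPolicy sits in a manifest is shown below. The Pod name, container name, and busybox image are assumptions for illustration; the container deliberately exits with a non-zero status so that the restart behavior can be observed.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restart-demo               # hypothetical name
spec:
  restartPolicy: OnFailure         # one of Always (default), OnFailure, Never
  containers:
  - name: flaky
    image: busybox
    # Exits with status 1 after 5 seconds, so the Container counts as failed.
    args: ["/bin/sh", "-c", "sleep 5; exit 1"]
```

Because the field lives in the Pod spec rather than on a container, the same policy applies to every Container in the Pod.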

Pod lifetime

In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by the master) will expire and be automatically destroyed.

Three types of controllers are available:

  • Use a Job for Pods that are expected to terminate, for example, batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.

  • Use a ReplicationController, ReplicaSet, or Deployment for Pods that are not expected to terminate, for example, web servers. ReplicationControllers are appropriate only for Pods with a restartPolicy of Always.

  • Use a DaemonSet for Pods that need to run one per machine, because they provide a machine-specific system service.

All three types of controllers contain a PodTemplate. It is recommended to create the appropriate controller and let it create Pods, rather than directly create Pods yourself. That is because Pods alone are not resilient to machine failures, but controllers are.
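As one example of letting a controller create Pods, the sketch below shows a Job whose PodTemplate carries the restartPolicy. The Job name, container name, image, and command are assumptions chosen for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-demo                 # hypothetical name
spec:
  template:                        # the PodTemplate the Job creates Pods from
    spec:
      restartPolicy: Never         # Jobs are appropriate only with OnFailure or Never
      containers:
      - name: work
        image: busybox
        # Stands in for a batch computation that runs to completion.
        args: ["/bin/sh", "-c", "echo computing; sleep 10"]
```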

If a node dies or is disconnected from the rest of the cluster, Kubernetes applies a policy for setting the phase of all Pods on the lost node to Failed.

Examples

Advanced liveness probe example

Liveness probes are executed by the kubelet, so all requests are made in the kubelet network namespace.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - args:
    - /server
    image: k8s.gcr.io/liveness
    livenessProbe:
      httpGet:
        # when "host" is not defined, "PodIP" will be used
        # host: my-host
        # when "scheme" is not defined, "HTTP" scheme will be used. Only "HTTP" and "HTTPS" are allowed
        # scheme: HTTPS
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: liveness

Example states

  • Pod is running and has one Container. Container exits with success.

    • Log completion event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Pod phase becomes Succeeded.
      • Never: Pod phase becomes Succeeded.
  • Pod is running and has one Container. Container exits with failure.
    • Log failure event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Pod phase becomes Failed.
  • Pod is running and has two Containers. Container 1 exits with failure.
    • Log failure event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Do not restart Container; Pod phase stays Running.
    • If Container 1 is not running, and Container 2 exits:
      • Log failure event.
      • If restartPolicy is:
        • Always: Restart Container; Pod phase stays Running.
        • OnFailure: Restart Container; Pod phase stays Running.
        • Never: Pod phase becomes Failed.
  • Pod is running and has one Container. Container runs out of memory.
    • Container terminates in failure.
    • Log OOM event.
    • If restartPolicy is:
      • Always: Restart Container; Pod phase stays Running.
      • OnFailure: Restart Container; Pod phase stays Running.
      • Never: Log failure event; Pod phase becomes Failed.
  • Pod is running, and a disk dies.
    • Kill all Containers.
    • Log appropriate event.
    • Pod phase becomes Failed.
    • If running under a controller, Pod is recreated elsewhere.
  • Pod is running, and its node is segmented out.
    • Node controller waits for timeout.
    • Node controller sets Pod phase to Failed.
    • If running under a controller, Pod is recreated elsewhere.
