What the hell is a Pod anyways?
By Andrew Chen and Dominik Tornow
Kubernetes is a Container Orchestration Engine designed to host containerized applications on a set of nodes, commonly referred to as a cluster. Using a systems modeling approach this series aims to advance the understanding of Kubernetes and its underlying concepts.
For this blog post, a basic understanding of Kubernetes is recommended.
Pods are the fundamental building blocks of Kubernetes, yet even seasoned Kubernetes users struggle to describe what Pods actually are.
This blog post provides a concise mental model that highlights defining characteristics of Kubernetes Pods. However, other characteristics like Liveness Probes and Readiness Probes, sharing resources, or networking are omitted in favor of brevity.
Definition
A Pod represents a request to execute one or more containers on the same node.
A Pod is defined as the representation of a request to execute one or more containers on the same node; containers share access to resources like volumes and network stacks.
Colloquially, however, the term Pod may refer either to the request or to the set of containers that are executed in response to that request. This blog post uses the term “Pod” when referring to the request and “Container Set” when referring to the set of containers.
Pods are considered the fundamental building blocks of Kubernetes, because all Kubernetes workloads, like Deployments, ReplicaSets, or Jobs, are eventually expressed in terms of Pods.
Pods are the one and only objects in Kubernetes that result in the execution of containers. No Pod, no container!
Kubernetes architecture
Figure 2 highlights the relevant objects and components. Pods are represented as Kubernetes Pod Objects and are processed by
- the Scheduler and
- the Kubelet.
Kubernetes Objects
Figure 3 depicts the Kubernetes Objects that are involved in processing a Pod:
- the Pod Object itself
- a Binding Object
- a Node Object
The Pod Object specifies the set of containers to be executed, the desired restart policy in case of a container failure, and tracks the status of execution.
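As a rough illustration of those three parts, the sketch below writes a trimmed-down Pod Object as a plain Python dictionary. The names "example-pod", "init-1", and "main-1" and the images are hypothetical; the keys follow the JSON spelling of the Pod fields (for example `initContainers` for what this post writes as .Spec.InitContainers). The helper `container_names` is an illustrative function, not part of any Kubernetes client.

```python
# Hypothetical, trimmed-down Pod Object written as a plain dictionary.
# All names and images are illustrative assumptions.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod", "namespace": "default"},
    "spec": {
        # Desired restart policy in case of a container failure.
        "restartPolicy": "OnFailure",
        # The set of containers to be executed.
        "initContainers": [{"name": "init-1", "image": "busybox"}],
        "containers": [{"name": "main-1", "image": "nginx"}],
    },
    # The status tracks execution; it is written by Kubernetes, not by the user.
    "status": {
        "phase": "Pending",
        "initContainerStatuses": [],
        "containerStatuses": [],
    },
}


def container_names(pod):
    """Return the names of the init containers and the main containers."""
    spec = pod["spec"]
    init = [c["name"] for c in spec.get("initContainers", [])]
    main = [c["name"] for c in spec["containers"]]
    return init, main
```

The split between `spec` (what is requested) and `status` (what has been observed) mirrors the request-versus-Container-Set distinction made earlier.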
A Binding Object binds a Pod Object to a Node Object, that is, it assigns a Pod to a node for subsequent execution.
A Node Object represents a node in the Kubernetes cluster.
Processing a Pod
After a Pod is created by a user or by a controller, such as the ReplicaSet Controller or the Job Controller, Kubernetes processes the Pod in two steps:
- the Scheduler schedules the Pod
- the Kubelet executes the Pod
Scheduling of a Pod
The task of the Kubernetes Scheduler is to schedule the Pod, that is, to assign the Pod to an appropriate node in the Kubernetes cluster for subsequent execution.
A Pod is assigned (or bound) to a node if and only if there is a Binding Object such that
- the Binding’s namespace equals the Pod’s namespace
- the Binding’s name equals the Pod’s name
- the Binding’s target kind equals “Node”
- the Binding’s target name equals the Node’s name
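The four conditions above can be sketched as a predicate over dictionary-shaped objects. This is only a model of the rule as stated in this post, not the Scheduler's implementation; the function name `is_bound` and the dictionary layout (`metadata`, `target`) are assumptions chosen to resemble the JSON form of the objects.

```python
def is_bound(pod, node, bindings):
    """Return True if some Binding in `bindings` binds `pod` to `node`."""
    return any(
        # the Binding's namespace equals the Pod's namespace
        b["metadata"].get("namespace") == pod["metadata"].get("namespace")
        # the Binding's name equals the Pod's name
        and b["metadata"]["name"] == pod["metadata"]["name"]
        # the Binding's target kind equals "Node"
        and b["target"]["kind"] == "Node"
        # the Binding's target name equals the Node's name
        and b["target"]["name"] == node["metadata"]["name"]
        for b in bindings
    )


# Hypothetical example objects.
example_pod = {"metadata": {"namespace": "default", "name": "example-pod"}}
example_node = {"metadata": {"name": "node-1"}}
example_binding = {
    "metadata": {"namespace": "default", "name": "example-pod"},
    "target": {"kind": "Node", "name": "node-1"},
}
```

With these objects, `is_bound(example_pod, example_node, [example_binding])` holds, while an empty list of Bindings leaves the Pod unbound.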
(Adventurous readers can visit Kelsey Hightower’s GitHub gist “Creating and Scheduling a Pod Manually”, a step-by-step tutorial on how to create a Binding Object manually.)
Execution of a Pod
The task of the Kubelet is to execute the Pod, that is, to execute the container set of the Pod. The Kubelet executes a Pod in two phases: the initialization phase and the main phase.
Typically the container set of the initialization phase performs preparation tasks like preparing expected directory structures and files. The container set of the main phase performs the “most important” tasks.
Colloquially, although inaccurately, the term Pod often refers to the container set of the main phase, or more specifically to the “most important” container of the main phase.
During the initialization phase, the Kubelet sequentially executes containers according to the Pod’s .Spec.InitContainers specifications, in the order specified in the list. For a successful execution of a Pod, taking the restart policy into account, init containers are expected to run to completion and terminate successfully.
During the main phase, the Kubelet concurrently executes containers according to the Pod’s .Spec.Containers specifications. For a successful execution of a Pod, taking the restart policy into account, main containers may run to completion and terminate successfully or run indefinitely.
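To make the difference between the two phases concrete, here is a minimal simulation, assuming each container is modeled as a Python callable that returns an exit code. It is not how the Kubelet is implemented: restarts are deliberately left out, a failed init container simply stops the run, and the names `execute_pod` and `step` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_pod(init_containers, main_containers):
    """Run init containers one after another, then main containers concurrently.

    Returns (init_exit_codes, main_exit_codes). Restart policy is ignored
    in this sketch.
    """
    init_codes = []
    # Initialization phase: strictly sequential, in list order.
    for run in init_containers:
        code = run()
        init_codes.append(code)
        if code != 0:
            # A failed init container keeps the main phase from starting.
            return init_codes, []
    if not main_containers:
        return init_codes, []
    # Main phase: all containers are started concurrently.
    with ThreadPoolExecutor(max_workers=len(main_containers)) as pool:
        futures = [pool.submit(run) for run in main_containers]
        main_codes = [f.result() for f in futures]
    return init_codes, main_codes


log = []


def step(name, exit_code=0):
    """Build a fake container that records its name and returns exit_code."""
    def run():
        log.append(name)
        return exit_code
    return run
```

For example, `execute_pod([step("init-1"), step("init-2")], [step("main-1"), step("main-2")])` records "init-1" before "init-2" in `log`, whereas the relative order of the two main containers is not guaranteed.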
In the case of a container failure, that is, when the container terminates with an exit code other than zero (0), the Kubelet may restart the container according to the Pod’s restart policy. The restart policy is one of the following: “Always”, “OnFailure”, or “Never”.
The Pod’s restart policy has different semantics for init containers and main containers: init containers are expected to run to completion, whereas main containers may or may not run to completion.
On termination, an init container is restarted (a new container with the same specification will be executed), if and only if
- the container’s exit code indicates failure and
- the Pod’s restart policy is either “Always” or “OnFailure”
On termination, a main container will be restarted (a new container with the same specification will be executed), if and only if
- the restart policy is “Always” or
- the restart policy is “OnFailure” and the container’s exit code indicates failure
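The two restart rules translate almost literally into code. The sketch below assumes an exit code of zero means success and anything else means failure, as described above; the function names are illustrative.

```python
def should_restart_init(exit_code, restart_policy):
    """Init container: restart only on failure, and only if the policy allows it."""
    failed = exit_code != 0
    return failed and restart_policy in ("Always", "OnFailure")


def should_restart_main(exit_code, restart_policy):
    """Main container: "Always" restarts regardless of the exit code."""
    failed = exit_code != 0
    return restart_policy == "Always" or (restart_policy == "OnFailure" and failed)
```

The difference shows up for a successful termination under “Always”: `should_restart_init(0, "Always")` is False because the init container has done its job, while `should_restart_main(0, "Always")` is True.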
Figure 8 illustrates a possible execution timeline of a Pod with two init container specifications and two main container specifications. Figure 8 also illustrates the creation of a new container, “Main Container 1.2”, upon the failure of “Main Container 1.1”, as required by the restart policy.
Pod Phases
The Kubelet retrieves the Pod’s .Spec.InitContainers and .Spec.Containers specifications, executes the specified container set, and updates the Pod’s .Status.InitContainerStatuses and .Status.ContainerStatuses accordingly.
The Kubelet rolls up the Pod’s .Status.InitContainerStatuses and .Status.ContainerStatuses into a single value, the .Status.Phase.
The Pod Phase is a projection of the state of the containers in the container set and depends on
- Init containers states and exit codes
- Main containers states and exit codes
Pending
A Pod is in the phase Pending if and only if
- none of the Pod’s Init Containers are in the state Terminated/Failure
- all of the Pod’s Main Containers are in the state Waiting
Running
A Pod is in the phase Running if and only if
- all of the Pod’s Init Containers are in the state Terminated/Success
- at least one of the Pod’s Main Containers is in the state Running
- none of the Pod’s Main Containers are in the state Terminated/Failure
Succeeded
A Pod is in the phase Succeeded if and only if
- all of the Pod’s Init Containers are in the state Terminated/Success
- all of the Pod’s Main Containers are in the state Terminated/Success
Failed
A Pod is in the phase Failed if and only if
- all of the Pod’s Containers are in the state Terminated
- at least one of the Pod’s Containers is in the state Terminated/Failure
Unknown
In addition to the previously described phases, a Pod may be in the phase Unknown, indicating that the actual phase of the Pod could not be determined.
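The phase rules above can be read as a projection function from container states to a single value. The following is a minimal sketch of that reading, not the Kubelet's actual logic. It assumes container states are given as the strings "Waiting", "Running", "Terminated/Success", and "Terminated/Failure", that the Pod has at least one main container, and that any combination not covered by the stated rules falls back to "Unknown" (that fallback is an assumption of this sketch). The function returns the phase strings as spelled in the Kubernetes API: Pending, Running, Succeeded, Failed, Unknown.

```python
SUCCESS = "Terminated/Success"
FAILURE = "Terminated/Failure"


def pod_phase(init_states, main_states):
    """Project init and main container states onto a single Pod phase."""
    all_init_ok = all(s == SUCCESS for s in init_states)
    all_states = list(init_states) + list(main_states)

    # Pending: no init container has failed, every main container is waiting.
    if FAILURE not in init_states and all(s == "Waiting" for s in main_states):
        return "Pending"
    # Running: init done, at least one main container running, none failed.
    if all_init_ok and "Running" in main_states and FAILURE not in main_states:
        return "Running"
    # Succeeded: every init and every main container terminated successfully.
    if all_init_ok and all(s == SUCCESS for s in main_states):
        return "Succeeded"
    # Failed: every container has terminated and at least one of them failed.
    if all(s in (SUCCESS, FAILURE) for s in all_states) and FAILURE in all_states:
        return "Failed"
    # Assumption of this sketch: anything else is reported as Unknown.
    return "Unknown"
```

For instance, `pod_phase(["Terminated/Success"], ["Running", "Terminated/Success"])` yields "Running", and it becomes "Succeeded" once the remaining main container also terminates successfully.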
Pod Garbage Collection
After the Pod has been scheduled and executed, the Kubernetes Pod Garbage Collector Controller is responsible for deleting the Kubernetes Pod Object from the Kubernetes Object Store.
Conclusion
The Pod is the fundamental building block of Kubernetes: a Pod is defined as the representation of a request to execute one or more containers on the same node. After a Pod is created, Kubernetes processes the Pod in two steps: first, the Kubernetes Scheduler schedules the Pod; second, the Kubelet executes the Pod. During its lifetime, the Pod transitions through various Phases, reporting its state (or, more precisely, the state of its container set) to the user and the system.
About this post
This blog post is part of a collaborative effort between the CNCF, Google, and SAP to advance the understanding of Kubernetes and its underlying concepts.