Kubernetes API Server, Part I
By Andrew Chen and Dominik Tornow
Kubernetes is a Container Orchestration Engine designed to host containerized applications on a set of nodes, commonly referred to as a cluster. Using a systems modeling approach, this series aims to advance the understanding of Kubernetes and its underlying concepts.
This blog post uses the Alloy Specification Language, a specification language for expressing structure and behavior based on first-order logic. However, a plain English description of each Alloy specification is provided.
This blog post is the first in a three-part series:
- Part I describes the structure and behavior of the API Server
- Part II describes the Kubernetes API
- Part III describes the Kubernetes Object Store
Preface — The Term “API Server”
The term “API Server” is heavily overloaded and refers to multiple concepts. This blog post will use the terms API Server, Kubernetes API, and Kubernetes Object Store to denote individual concepts.
- “Kubernetes API” denotes the component that processes read and write requests and queries or modifies the Kubernetes Object Store accordingly.
- “Kubernetes Object Store” denotes the persistent set of Kubernetes Objects.
- “API Server” denotes the union of the Kubernetes API and the Kubernetes Object Store.
The API Server
The Kubernetes API Server is a core component of Kubernetes. Conceptually, the Kubernetes API Server is the database of Kubernetes, representing the state of the cluster as a set of Kubernetes Objects. Examples of Kubernetes Objects are Pod Objects, ReplicaSet Objects, and Deployment Objects.
The Kubernetes API Server exists in multiple revisions. A revision is a snapshot in time allowing arbitrary time travel, similar to a git repository:
- The Kubernetes API Server has a .rev property, short for Kubernetes API Server revision. The .rev property indicates the snapshot in time.
- Each Kubernetes Object has a .mod property, short for Kubernetes Object revision. The .mod property indicates the snapshot in time at which the object was last modified.
However, in practice, the implementation of the Kubernetes API Server limits time travel and discards snapshots after 5 minutes by default.
The Kubernetes API Server exposes a CRUD (Create/Read/Update/Delete) interface without support for transactional semantics:
- A Write Request is guaranteed to be executed against the latest revision and increases the revision accordingly.
- A Read Request is not guaranteed to be executed against the latest revision, depending on the setup and configuration of the API Server.
The lack of transactional semantics leads to classical race conditions, like non-deterministic writes.
The lack of read-last-write semantics leads to two distinct effects, stale reads and out-of-order reads:
- Stale reads refer to the phenomenon that a read request may not be executed against the latest revision, therefore yielding an “outdated” response.
- Out-of-order reads refer to the phenomenon that, of two subsequent read requests, the first may be executed against a higher revision and the second against a lower revision, therefore yielding an out-of-order response.
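The two effects can be illustrated with a toy model. This is only a sketch: the `history` table and the idea that each read is served by a replica that has caught up to some revision are assumptions made for this example, not a description of how the API Server is implemented.

```python
# Hypothetical model: the state of one object at each API Server revision.
history = {
    1: {"replicas": 1},
    2: {"replicas": 2},
    3: {"replicas": 3},
}
latest_rev = max(history)


def read(replica_rev):
    """Serve a read from a replica that has caught up to replica_rev."""
    return replica_rev, history[replica_rev]


# Two subsequent reads happen to be served at different revisions.
first_rev, first_state = read(3)
second_rev, second_state = read(2)

# Stale read: the second response is older than the latest revision.
is_stale = second_rev < latest_rev
# Out-of-order read: the second response is older than the first response.
is_out_of_order = second_rev < first_rev
```

A client that remembers the highest revision it has seen can detect both effects by comparing revisions, which is what motivates freshness tokens.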
Fencing and Freshness Tokens
A client may use the revision properties either as fencing tokens for write operations, to compensate for the missing transactional semantics, or as freshness tokens for read operations, to compensate for the missing read-last-write guarantees.
In case of write operations, a client may use .rev or .mod as a fencing token. The client specifies an expected .rev or .mod value. The API Server will process the request if and only if the current .rev or .mod equals the expected value, a process known as optimistic locking.
In the case of read operations, .rev or .mod may be used as a freshness token, indicating that the read request shall return results no older than the freshness token value.
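Optimistic locking with .mod as a fencing token might look like the following sketch. The `Store` class, the `Conflict` exception, and the `update_with_retry` helper are hypothetical names introduced for illustration; the store holds a single value to keep the example short.

```python
class Conflict(Exception):
    """Raised when the expected .mod does not match the stored .mod."""


class Store:
    """Hypothetical single-object store with a revision counter."""

    def __init__(self):
        self.rev = 0       # revision of the store
        self.value = None  # the stored object
        self.mod = 0       # revision at which the object was last modified

    def read(self):
        return self.value, self.mod

    def write(self, value, expected_mod):
        # Process the write if and only if the fencing token matches.
        if expected_mod != self.mod:
            raise Conflict(f"expected mod {expected_mod}, found {self.mod}")
        self.rev += 1
        self.value = value
        self.mod = self.rev


def update_with_retry(store, fn, attempts=3):
    """Read, compute, and write; start over if another writer got in first."""
    for _ in range(attempts):
        value, mod = store.read()
        try:
            store.write(fn(value), expected_mod=mod)
            return
        except Conflict:
            continue
    raise Conflict("gave up after repeated conflicts")
```

A rejected write tells the client that its view was outdated, so the usual response is to read again and retry, as `update_with_retry` does.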
Structural Specification
- The Kubernetes API Server has a set of Kubernetes Objects and a .rev.
- Kubernetes Objects have a kind, name, namespace, and .mod.
- Objects are identified by their kind, name, and namespace triplet.
- No two distinct Kubernetes Objects in the API Server may have the same kind, name, and namespace triplet.
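The structural specification can be sketched as a small in-memory model. This is an illustrative Python sketch, not the Alloy specification and not Kubernetes code; the names `KubeObject`, `ApiServer`, `key`, and `add` are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Key = Tuple[str, str, str]  # (kind, name, namespace)


@dataclass(frozen=True)
class KubeObject:
    kind: str
    name: str
    namespace: str
    mod: int  # revision at which the object was last modified

    @property
    def key(self) -> Key:
        """The triplet that identifies the object."""
        return (self.kind, self.name, self.namespace)


@dataclass
class ApiServer:
    rev: int = 0
    # Keying the store by the triplet makes duplicates easy to detect.
    objects: Dict[Key, KubeObject] = field(default_factory=dict)

    def add(self, obj: KubeObject) -> None:
        # Enforce the uniqueness constraint on (kind, name, namespace).
        if obj.key in self.objects:
            raise ValueError(f"duplicate object: {obj.key}")
        self.objects[obj.key] = obj
```

Two objects with the same name are still distinct if they differ in kind or namespace, because the whole triplet forms the identity.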
Behavioral Specification
Conceptually, the Kubernetes API Server provides a write interface and a read interface. The write interface groups all commands that alter state; the read interface groups all commands that query state.
The Write Interface
The write interface provides the commands to create, update, and delete objects.
A Command is a state transition, moving the API Server from the current state to a subsequent state. Each command increases the revision of the API Server.
In addition, each command generates an event. An Event is a persistent, queryable record of a command execution.
Figure 8 depicts a series of commands and the resulting state transitions of the API Server.
The design and implementation of the Kubernetes API Server guarantee that the current state of the API Server at any point in time equals the aggregation of the event stream up to that point in time, a pattern also known as event sourcing.
state = reduce(apply, events, {})
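The reduction above can be made concrete with Python's `functools.reduce`. This is a minimal sketch: the event tuples of the form (type, key, object) and the `apply` function are assumptions made for this example, not the API Server's actual event format.

```python
from functools import reduce


def apply(state, event):
    """Return the state that results from applying one event."""
    event_type, key, obj = event
    new_state = dict(state)  # leave the previous state untouched
    if event_type in ("created", "updated"):
        new_state[key] = obj
    elif event_type == "deleted":
        new_state.pop(key, None)
    return new_state


events = [
    ("created", ("Pod", "web", "default"), {"image": "nginx:1"}),
    ("updated", ("Pod", "web", "default"), {"image": "nginx:2"}),
    ("created", ("Pod", "db", "default"), {"image": "postgres"}),
    ("deleted", ("Pod", "db", "default"), None),
]

# The current state is the aggregation of the whole event stream.
state = reduce(apply, events, {})

# The state at an earlier point in time is the aggregation of a prefix.
state_after_two = reduce(apply, events[:2], {})
```

Reducing over a prefix of the stream yields the state at that earlier point, which is the "time travel" property described for revisions.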
Create Command
- The Create Command adds a Kubernetes Object to the API Server and sets the Object’s .mod to the API Server’s .rev.
- The Create Command is rejected if the proposed object violates the API Server’s uniqueness constraint.
- For each Create Command, one persistent and queryable Created Event is generated, so that the event’s .object field references the created Kubernetes Object.
Update Command
- The Update Command updates a Kubernetes Object in the API Server and sets the Object’s .mod to the API Server’s .rev.
- The Update Command is rejected if the command’s .mod does not match the object’s .mod. Here .mod is used as a fencing token.
- For each Update Command, one persistent and queryable Updated Event is generated, so that the event’s .object field references the new Kubernetes Object.
Delete Command
- The Delete Command removes a Kubernetes Object from the API Server.
- The Delete Command is rejected if the command’s .mod does not match the object’s .mod. Here .mod is used as a fencing token.
- For each Delete Command, one persistent and queryable Deleted Event is generated, so that the event’s .object field references the deleted Kubernetes Object.
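The three commands can be sketched together as follows. The class and method names are hypothetical, and the sketch makes one interpretive assumption: an object's .mod is set to the revision that the command produces, that is, the API Server's .rev after the increment.

```python
class Rejected(Exception):
    """Raised when a command violates a constraint or a fencing token."""


class ApiServer:
    def __init__(self):
        self.rev = 0
        self.objects = {}  # key -> (spec, mod)
        self.events = []   # (rev, event type, key, spec)

    def _commit(self, event_type, key, spec):
        # Every accepted command increases the revision and records one event.
        self.rev += 1
        self.events.append((self.rev, event_type, key, spec))
        return self.rev

    def _check_fence(self, key, mod):
        if key not in self.objects or self.objects[key][1] != mod:
            raise Rejected("fencing token does not match the object's .mod")

    def create(self, key, spec):
        if key in self.objects:
            raise Rejected("uniqueness constraint violated")
        rev = self._commit("Created", key, spec)
        self.objects[key] = (spec, rev)  # assumption: .mod is the new .rev

    def update(self, key, spec, mod):
        self._check_fence(key, mod)
        rev = self._commit("Updated", key, spec)
        self.objects[key] = (spec, rev)

    def delete(self, key, mod):
        self._check_fence(key, mod)
        spec, _ = self.objects.pop(key)
        self._commit("Deleted", key, spec)
```

Rejected commands raise before `_commit` runs, so they neither change the revision nor produce an event.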
The Read Interface
The Kubernetes API read interface provides two sub-interfaces, an object-related interface and an event-related interface.
Object-related Interface
The object-related sub-interface provides the commands to read objects and lists of objects.
- The Read Object Request accepts a kind, name, and namespace triplet. In addition, the request accepts a min parameter used as a freshness token.
- The API Server returns a matching Kubernetes Object at least at the revision of the API Server specified by min.
- The Read List Request accepts a kind and namespace pair. In addition, the request accepts a min parameter used as a freshness token.
- The API Server returns matching Kubernetes Objects at least at the revision of the API Server specified by min.
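The effect of the min parameter can be sketched with a read path that is served from a cache which may lag behind the latest revision. The `ReadModel` class, its per-revision `snapshots` table, and the fallback to the latest revision are assumptions made for this example only.

```python
class ReadModel:
    """Hypothetical read path backed by a cache that may be outdated."""

    def __init__(self, snapshots, cache_rev):
        self.snapshots = snapshots  # rev -> {(kind, name, namespace): spec}
        self.cache_rev = cache_rev  # revision the cache has caught up to
        self.latest_rev = max(snapshots)

    def _pick_revision(self, min_rev):
        if min_rev > self.latest_rev:
            raise ValueError("min is ahead of the latest revision")
        # Serve from the cache if it is fresh enough, else from the latest state.
        return self.cache_rev if self.cache_rev >= min_rev else self.latest_rev

    def read_object(self, kind, name, namespace, min_rev=0):
        rev = self._pick_revision(min_rev)
        return self.snapshots[rev].get((kind, name, namespace)), rev

    def read_list(self, kind, namespace, min_rev=0):
        rev = self._pick_revision(min_rev)
        items = {key: spec for key, spec in self.snapshots[rev].items()
                 if key[0] == kind and key[2] == namespace}
        return items, rev


snapshots = {
    1: {("Pod", "web", "default"): "nginx:1"},
    2: {("Pod", "web", "default"): "nginx:2",
        ("Pod", "db", "default"): "postgres"},
}
model = ReadModel(snapshots, cache_rev=1)

stale = model.read_object("Pod", "web", "default")             # may be outdated
fresh = model.read_object("Pod", "web", "default", min_rev=2)  # at least rev 2
```

Without a freshness token the sketch answers from the outdated cache; with `min_rev=2` it must answer from a revision that is at least 2.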
Event-related Interface
The event-related sub-interface provides commands to read events regarding objects and events regarding lists of objects.
- The Watch Object Request accepts a kind, name, and namespace triplet. In addition, the request accepts a min parameter used as a freshness token.
- The API Server returns all matching events starting from the revision of the API Server specified by min.
- The Watch List Request accepts a kind and namespace pair. In addition, the request accepts a min parameter used as a freshness token.
- The API Server returns all matching events starting from the revision of the API Server specified by min.
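Both watch requests can be sketched as one filter over the recorded event stream. The event dictionaries and the `watch` function are hypothetical, and the sketch assumes that min is exclusive: a client that already holds the state at revision min only needs events with a higher revision. A real watch is a long-lived stream, whereas this sketch only replays events that have already been recorded.

```python
def watch(events, kind, namespace, min_rev, name=None):
    """Yield matching events recorded after min_rev, in revision order."""
    for event in sorted(events, key=lambda e: e["rev"]):
        event_kind, event_name, event_namespace = event["key"]
        if event["rev"] <= min_rev:
            continue  # assumption: the client already reflects this revision
        if event_kind != kind or event_namespace != namespace:
            continue
        if name is not None and event_name != name:
            continue  # Watch Object: only events for the named object
        yield event


events = [
    {"rev": 1, "type": "Created", "key": ("Pod", "web", "default")},
    {"rev": 2, "type": "Created", "key": ("Pod", "db", "default")},
    {"rev": 3, "type": "Updated", "key": ("Pod", "web", "default")},
    {"rev": 4, "type": "Created", "key": ("Pod", "web", "staging")},
]

# Watch Object: kind, name, and namespace triplet.
watch_object = [e["rev"] for e in watch(events, "Pod", "default", 1, name="web")]
# Watch List: kind and namespace pair.
watch_list = [e["rev"] for e in watch(events, "Pod", "default", 1)]
```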
Example
Combining the object-related sub-interface and the event-related sub-interface results in an efficient query mechanism that is widely used in Kubernetes, for example in Kubernetes Controllers.
Instead of repeatedly polling for the current state of an object or a list of objects, a client may request the current state once and subscribe to the subsequent event stream.
pods, rev := request-object-list(kind="pods", namespace="default")
for e in request-watch-list(kind="pods", namespace="default", rev)
  pods := apply(pods, e)
By threading the revision returned by the read request through to the watch request, the client is guaranteed to receive any event that happened between the read request and the watch request, as well as any event thereafter.
Implementing this pattern guarantees that the state of the client (eventually) mirrors the state of the API Server.
Conclusion
This blog post described the Kubernetes API Server’s structure and behavior. A key component in the design and implementation of adequate clients is the proper use of the Kubernetes API Server’s revision and the Kubernetes Objects’ revisions as fencing and freshness tokens.
The next blog posts examine the Kubernetes API and the Kubernetes Object Store.
About this post
This blog post is part of a collaborative effort between the CNCF, Google, and SAP to advance the understanding of Kubernetes and its underlying concepts.