Apply Semantics
This page describes how Metacontroller emulates kubectl apply
.
In most cases, you should be able to think of Metacontroller's apply semantics
as being the same as kubectl apply
, but there are some differences.
Motivation
This section explains why Metacontroller uses apply
semantics.
As an example, suppose you create a simple Pod like this
with kubectl apply -f
:
apiVersion: v1
kind: Pod
metadata:
name: my-pod
labels:
app: my-app
spec:
containers:
- name: nginx
image: nginx
If you then read back the Pod you created with kubectl get pod my-pod -o yaml
,
you'll see a lot of extra fields filled in that you never set:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu request for container
nginx'
creationTimestamp: 2018-04-13T00:46:51Z
labels:
app: my-app
name: my-pod
namespace: default
resourceVersion: "28573496"
selfLink: /api/v1/namespaces/default/pods/my-pod
uid: 27f1b2e1-3eb4-11e8-88d2-42010a800051
spec:
containers:
- image: nginx
imagePullPolicy: Always
name: nginx
resources:
requests:
cpu: 100m
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
[...]
These fields may represent materialized default values and other metadata set by the API server, values set by built-in admission control or external admission plugins, or even values set by other controllers.
Rather than sifting through all that to find the fields you care about,
kubectl apply
lets you go back to your original, simple file,
and make a change:
apiVersion: v1
kind: Pod
metadata:
name: my-pod
labels:
app: my-app
role: staging # added a label
spec:
containers:
- name: nginx
image: nginx
If you try to kubectl create -f
your updated file, it will fail because
you can't create something that already exists.
If you try to kubectl replace -f
your updated file, it will fail because
it thinks you're trying to unset all those extra fields.
However, if you use kubectl apply -f
with your updated file,
it will update only the part you changed (adding a label),
and leave all those extra fields untouched.
Metacontroller treats the desired objects you return from your hook in much the same way (but with some differences, such as support for strategic merge inside CRDs). As a result, you should always return the short form containing only the fields you care about, not the long form containing all the extra fields.
This generally means you should use the same code path to update things as you do to create them. Just generate a full JSON object from scratch every time, containing all the fields you care about, and only the fields you care about.
Metacontroller will figure out whether the object needs to be created or updated, and which fields it should and shouldn't touch in the case of an update.
Dynamic Apply
The biggest difference between kubectl's implementation of apply and Metacontroller's is that Metacontroller can emulate strategic merge inside CRDs.
For example, suppose you have a CRD with an embedded Pod template:
apiVersion: ctl.enisoc.com/v1
kind: CatSet # this resource is served via CRD
metadata:
name: my-catset
spec:
template: # embedded Pod template in CRD
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
name: web
You create this with apply:
kubectl apply -f catset.yaml
The promise of apply
is that it will "apply the changes you’ve made, without overwriting any automated changes to properties you haven’t specified".
As an example, suppose some other automation decides to edit your Pod template and add a sidecar container:
apiVersion: ctl.enisoc.com/v1
kind: CatSet # this resource is served via CRD
metadata:
name: my-catset
spec:
template: # embedded Pod template in CRD
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
name: web
- name: sidecar
image: log-uploader # fake sidecar example
Now suppose you change something in your local file and reapply it:
kubectl apply -f catset.yaml
Because kubectl apply
doesn't support strategic merge inside CRDs,
this will completely replace the containers
list with yours,
removing the sidecar
container.
By contrast, if this had been a Deployment or StatefulSet,
kubectl apply
would have preserved the sidecar
container.
As a result, if a controller uses kubectl's apply implementation with CRDs, that controller will fight against automation that tries to add sidecar containers or makes other changes to lists of objects that Kubernetes expects to be treated like associative arrays (ports, volumes, etc.).
To avoid this fighting, and to make the experience of using CRDs beter match that of native resources, Metacontroller uses an alternative implementation of apply logic that's based on convention instead of configuration.
Conventions
The main convention that Metacontroller enforces on apply semantics is how to detect and handle "associative lists".
In Kubernetes API conventions, an associative list is a list of objects or scalars that should be treated as if it were a map (associative array), but because of limitations in JSON/YAML it looks the same as an ordered list when serialized.
For native resources, kubectl apply
determines which lists are associative
lists by configuration: it must have compiled-in knowledge of all the resources,
and metadata about how each of their fields should be treated.
There is currently no mechanism for CRDs to specify this metadata,
which is why kubectl apply
falls back to assuming all lists are "atomic",
and should never be merged (only replaced entirely).
Even if there were a mechanism for CRDs to specify metadata for every field
(e.g. through extensions to OpenAPI),
it's not clear that it makes sense to require every CRD author to do so
in order for their resources to behave correctly when used with kubecl apply
.
One alternative that has been considered for such "schemaless CRDs" is to
establish a convention -- as long as your CRD follows the convention, you
don't need to provide configuration.
Metacontroller implements one such convention that empirically handles many common cases encountered when embedding Pod templates in CRDs (although it has limitations), developed by surveying the use of associative lists across the resources built into Kubernetes:
- A list is detected as an associative list if and only if all of the
following conditions are met:
- All items in the list are JSON objects (not scalars, nor other lists).
- All objects in the list have some field name in common,
where that field name is one of the conventional merge keys
(most commonly
name
).
- If a list is detected as an associative list, the conventional
field name that all objects have in common (e.g.
name
) is used as the merge key.- If more than one conventional merge key might work, pick only one according to a fixed order.
This allows Metacontroller to "do the right thing" in the majority of cases, without requiring advance knowledge about the resources it's working with -- knowledge that's not available anywhere in the case of CRDs.
In the future, Metacontroller will likely switch from this custom apply implementation to server-side apply, which is trying to solve the broader problem for all components that interact with the Kubernetes API. However, it's not yet clear whether that proposal will embrace schemaless CRDs and support apply semantics on them.
Limitations
A convention-based approach is necessarily more limiting than the native apply implementation, which supports arbitrary per-field configuration. The trade-off is that conventions reduce boilerplate and lower the barrier to entry for simple use cases.
This section lists some examples of configurations that the native apply allows, but are currently not supported in Metacontroller's convention-based apply. If any of these are blockers for you, please file an issue describing your use case.
- Atomic object lists
- A list of objects that share one of the conventional keys, but should nevertheless be treated atomically (replaced rather than merged).
- Unconventional associative list keys
- An associative list that doesn't use one of the conventional keys.
- Multi-field associative list keys
- A key that's composed of two or more fields (e.g. both
port
andprotocol
).
- A key that's composed of two or more fields (e.g. both
- Scalar-valued associative lists
- A list of scalars (not objects) that should be merged as if the scalar values were field names in an object.