Troubleshooting
This is a collection of tips for debugging controllers written with Metacontroller.
If you have something to add to the collection, please send a pull request against this document.
Events
As metacontroller emits kubernetes Events for internal actions, you might check events on parent object, like:
kubectl describe secretpropagations.examples.metacontroller.io <name>
where, at the end, you will see all events related with given parent:
Name: secret-propagation
Namespace:
Labels: <none>
Annotations: <none>
API Version: examples.metacontroller.io/v1alpha1
Kind: SecretPropagation
Metadata:
Creation Timestamp: 2021-07-14T20:25:09Z
...
Spec:
Source Name: shareable
Source Namespace: omega
Target Namespace Label Selector:
Match Labels:
Propagate: true
Status:
Working: fine
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning SyncError 1s (x11 over 8s) metacontroller Sync error: sync hook failed for SecretPropagation /secret-propagation: sync hook failed: http error: Post "http://secret-propagation-controller.metacontroller/sync": dial tcp 10.96.138.14:80: connect: connection refused
You can access also events using kubectl get events
, which return all events from given namespace. As metacontroller
CRD's are might be cluster wide, they can land in default
namespace:
> kubectl get events -n default
39m Normal Started compositecontroller/secret-propagation-controller Started controller: secret-propagation-controller
39m Normal Starting compositecontroller/secret-propagation-controller Starting controller: secret-propagation-controller
39m Normal Stopping compositecontroller/secret-propagation-controller Stopping controller: secret-propagation-controller
39m Normal Stopped compositecontroller/secret-propagation-controller Stopped controller: secret-propagation-controller
6m25s Normal Started compositecontroller/secret-propagation-controller Started controller: secret-propagation-controller
6m25s Normal Starting compositecontroller/secret-propagation-controller Starting controller: secret-propagation-controller
2m27s Normal Stopping compositecontroller/secret-propagation-controller Stopping controller: secret-propagation-controller
2m27s Normal Stopped compositecontroller/secret-propagation-controller Stopped controller: secret-propagation-controller
Metacontroller Logs
Until Metacontroller emits events, the first place to look when troubleshooting controller behavior is the logs for the Metacontroller server itself.
For example, you can fetch the last 25 lines with a command like this:
kubectl -n metacontroller logs --tail=25 -l app=metacontroller
Log Levels
You can customize the verbosity of the Metacontroller server's logs with the
--zap-log-level
flag.
At all log levels, Metacontroller will log the progress of server startup and shutdown, as well as major changes like starting and stopping hosted controllers.
At level 4 and above, Metacontroller will log actions (like create/update/delete) on individual objects (like Pods) that it takes on behalf of hosted controllers. It will also log when it decides to sync a given controller as well as events that may trigger a sync.
At level 5 and above, Metacontroller will log the diffs between existing objects, and the desired state of those objects returned by controller hooks.
At level 6 and above, Metacontroller will log every hook invocation as well as the JSON request and response bodies.
Common Log Messages
Since API discovery info is refreshed periodically, you may see log messages like this when you start a controller that depends on a recently-installed CRD:
failed to sync CompositeController "my-controller": discovery: can't find resource <resource> in apiVersion <group>/<version>
Usually, this should fix itself within about 30s when the new CRD is discovered. If this message continues indefinitely, check that the resource name and API group/version are correct.
You may also notice periodic log messages like this:
Watch close - *unstructured.Unstructured total <X> items received
This comes from the underlying client-go library, and just indicates when the shared caches are periodically flushed to place an upper bound on cache inconsistency due to potential silent failures in long-running watches.
Webhook Logs
If you return an HTTP error code (e.g., 500) from your webhook, the Metacontroller server will log the text of the response body.
If you need more detail on what's happening inside your hook code, as opposed to what Metacontroller does for you, you'll need to add log statements to your own code and inspect the logs on your webhook server.