Alois & Alex Jones (OpenFeature folks)
“Kubernetes is becoming the POSIX of the modern age”
Separates “infra primitives” (load balancer, object storage) from “app primitives”.
day 1: launch control. Deploying & releasing a new service.
- provisioning the deployment
- Has a lot of toil b/c we don’t segment infra and app concerns in observability
day 2: mission control. handle unplanned events (not related to releases)
- define SLOs
- configure alerts
- Modify the desired state (scale, disable, etc)
- provide support for imperative changes (for breakfix)
AIOps
- “context-less alerts”
- understanding “failure propagation”
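Not from the talk: a minimal sketch of one way to reason about failure propagation, assuming you already have a map of which service calls which. The service names and the `impacted_by` helper are hypothetical. The idea is that an alert on one component can be given context by listing everything upstream that depends on it.

```python
from collections import deque

# Hypothetical dependency map: each service lists the services it calls.
DEPENDS_ON = {
    "frontend": ["checkout", "catalog"],
    "checkout": ["payments", "db"],
    "catalog": ["db"],
    "payments": [],
    "db": [],
}


def impacted_by(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a dependency to its callers.
    callers = {name: [] for name in depends_on}
    for service, deps in depends_on.items():
        for dep in deps:
            callers.setdefault(dep, []).append(service)

    # Breadth-first walk over the callers graph.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)


if __name__ == "__main__":
    # Services that could be affected if the (hypothetical) db fails.
    print(impacted_by("db", DEPENDS_ON))
```

In a real cluster the dependency map would more likely come from traces or a service mesh than from a hand-written dict; that part is left out here.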
To move from day 1 to day 2:
- add declarative monitoring as part of the system configuration
- hook in your o11y as part of the k8s config
- support for remediation jobs to rectify problems automatically. This is problematic in gitops b/c the system will self-heal back to the bad state. TAG App Delivery is working on this. They’re looking into “gitops reverse sync” to push changes back to git.
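A toy model of the remediation problem described above, not real Argo or TAG App Delivery code. Git and the live cluster are plain dicts, and `reconcile`, `remediate`, and `reverse_sync` are hypothetical stand-ins. It only shows why an imperative fix gets reverted by forward sync, and how pushing the change back to git would avoid that.

```python
def reconcile(git_state, live_state):
    """Forward sync: overwrite live state with whatever git declares."""
    live_state.clear()
    live_state.update(git_state)


def remediate(live_state):
    """Hypothetical remediation job: scale up imperatively in the live system."""
    live_state["replicas"] = 5


def reverse_sync(git_state, live_state):
    """Hypothetical reverse sync: record the live fix back into git."""
    git_state.update(live_state)


def run(with_reverse_sync):
    """Apply a fix, optionally reverse sync, then let gitops reconcile."""
    git = {"replicas": 1}  # the "bad" desired state stored in git
    live = dict(git)
    remediate(live)
    if with_reverse_sync:
        reverse_sync(git, live)
    reconcile(git, live)
    return live["replicas"]


if __name__ == "__main__":
    print("without reverse sync:", run(False))  # fix is reverted to 1
    print("with reverse sync:", run(True))      # fix survives as 5
```

What a reverse sync would look like in practice (direct commit, pull request, approval step) is not covered in these notes, so the sketch treats it as a simple copy.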
SLOs & error budgets are the input for corrective actions.
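A rough sketch of how an error budget could feed a corrective action, assuming a request-based availability SLO. The function names, the 25% warning threshold, and the action labels are all assumptions for illustration, not something the speakers specified.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative when overspent)."""
    # The budget is the number of failures the SLO tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 0.0 if failed_requests == 0 else float("-inf")
    return 1 - failed_requests / allowed_failures


def corrective_action(remaining, warn_below=0.25):
    """Map remaining budget to a hypothetical action label."""
    if remaining <= 0:
        return "remediate"  # budget exhausted: trigger a remediation job
    if remaining < warn_below:
        return "alert"      # budget nearly spent: notify a human
    return "none"


if __name__ == "__main__":
    # 99% SLO over 10,000 requests tolerates roughly 100 failures.
    for failed in (50, 90, 150):
        remaining = error_budget_remaining(0.99, 10_000, failed)
        print(failed, round(remaining, 3), corrective_action(remaining))
```

The point is that the action is driven by budget consumption rather than by a single failing request, which is what separates this from a context-less alert.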
CNCF projects (examples, not exhaustive):
- argo to do gitops sync
- crossplane to deal w/ cross-cloud issues https://crossplane.io/
- o11y aggregation (fluentd, jaeger, prometheus)
- app repos (keptn)
Crossplane has support for things like “infrastructure provider” which can be custom-written to handle bare metal things.