Sprint 81: Dev Server Migration — k3d Cluster Synchronization
Decisions
D1: kube-state-metrics New Deployment (Base Manifest)
Context: The kube-state-metrics:8080 scrape target is defined in the Prometheus config, but the manifest for this resource was missing, causing scrape failures on both OCI and dev.
Choice: Created a new Deployment/Service/ServiceAccount/ClusterRole/ClusterRoleBinding in infra/k3s/kube-state-metrics.yaml. Collection is restricted with --namespaces=algosu to reduce resource usage.
Alternatives: (a) Install via the Helm chart: rejected as inconsistent with kustomize-based operations. (b) Remove the scrape target entirely: rejected because it would make collecting k8s resource metrics (HPA/PDB, etc.) impossible.
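For illustration, the Deployment and Service parts of such a manifest might look roughly like the sketch below. This is not the contents of infra/k3s/kube-state-metrics.yaml: the image tag, the labels, and the choice of the algosu namespace for the objects themselves are assumptions, and the ServiceAccount/ClusterRole/ClusterRoleBinding are omitted for brevity. Only the --namespaces=algosu argument and port 8080 come from the decision above.

```yaml
# Sketch only. Image tag, labels, and metadata.namespace are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: algosu            # assumption: deployed next to the workloads
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
        - name: kube-state-metrics
          # assumed tag; pin whatever version the cluster actually uses
          image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0
          args:
            - --namespaces=algosu   # only watch this namespace
          ports:
            - name: http-metrics
              containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: algosu            # assumption, see above
spec:
  selector:
    app: kube-state-metrics
  ports:
    - name: http-metrics
      port: 8080               # matches the kube-state-metrics:8080 scrape target
      targetPort: http-metrics
```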
D2: Remove stateful service securityContext in dev overlay
Context: k3d's local-path provisioner doesn't properly support fsGroup, so postgres/rabbitmq/minio pods with securityContext.fsGroup: 999 fail at startup with permission errors after the PVC is mounted.
Choice: Use JSON Patch in infra/overlays/dev/kustomization.yaml to remove the pod-level securityContext for postgres, postgres-problem, minio, and rabbitmq. The containers then run with each image's default UID.
Alternatives: (a) chown the volume in an initContainer: rejected because pod-level runAsNonRoot: true blocks a root initContainer. (b) Remove securityContext from base: rejected because it weakens production (OCI) security.
Code Paths: infra/overlays/dev/kustomization.yaml
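A minimal sketch of what these JSON Patch entries could look like in the dev overlay. The resource kind (StatefulSet) is an assumption; the actual workloads may be Deployments, in which case the target kind changes but the patch path stays the same.

```yaml
# Fragment of infra/overlays/dev/kustomization.yaml (sketch only).
# Assumption: all four stateful services are StatefulSets.
patches:
  - target:
      kind: StatefulSet
      name: postgres
    patch: |-
      - op: remove
        path: /spec/template/spec/securityContext
  - target:
      kind: StatefulSet
      name: postgres-problem
    patch: |-
      - op: remove
        path: /spec/template/spec/securityContext
  - target:
      kind: StatefulSet
      name: minio
    patch: |-
      - op: remove
        path: /spec/template/spec/securityContext
  - target:
      kind: StatefulSet
      name: rabbitmq
    patch: |-
      - op: remove
        path: /spec/template/spec/securityContext
```

A JSON Patch remove operation fails if the path does not exist, so each target is expected to define a pod-level securityContext in base.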
D3: Dev overlay readinessProbe falls back to /health
Context: In the current :dev images, the /health/ready endpoint is either unimplemented (404) or blocked by the Gateway authentication middleware (401). The resulting readinessProbe failures keep pods permanently Not Ready.
Choice: Patch the readinessProbe path to /health for 5 app services in the dev overlay.
Code Paths: infra/overlays/dev/kustomization.yaml
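As a rough sketch, one such probe patch might look like this. The Deployment name example-service is a hypothetical placeholder (the five service names are not listed here), and the patch assumes the app container is the first container in the pod and that base already defines an httpGet readinessProbe.

```yaml
# Fragment of infra/overlays/dev/kustomization.yaml (sketch only).
patches:
  - target:
      kind: Deployment
      name: example-service      # hypothetical name; repeat per app service
    patch: |-
      # assumes containers[0] is the app container with an httpGet probe
      - op: replace
        path: /spec/template/spec/containers/0/readinessProbe/httpGet/path
        value: /health
```

The replace operation requires the target path to exist already, which is why the sketch depends on base defining the probe.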
Patterns
P1: dev overlay JSON Patch Pattern for k3d Compatibility
When to Reuse: When production settings in the base manifests are incompatible with k3d/local environments. Selectively remove or modify securityContext, readinessProbe, imagePullSecrets, etc. in the dev overlay while leaving base unmodified.
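The same pattern applies to other fields named above. For instance, dropping imagePullSecrets in the dev overlay could be sketched as follows; the Deployment name is a hypothetical placeholder, and the patch assumes base sets imagePullSecrets at the pod level.

```yaml
# Fragment of a dev overlay kustomization.yaml (sketch only).
patches:
  - target:
      kind: Deployment
      name: example-service      # hypothetical name
    patch: |-
      # remove a production-only field; base stays untouched
      - op: remove
        path: /spec/template/spec/imagePullSecrets
```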
Root Cause: problem-service-secrets.DATABASE_HOST still points to postgres (main), but the problem-policy NetworkPolicy only allows egress to postgres-problem. The Secret was not updated after the DB separation was completed.
Fix: Update problem-service-secrets.DATABASE_HOST to postgres-problem and sync the password with postgres-problem-secret.
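A sketch of the corrected Secret, under stated assumptions: the namespace and the DATABASE_PASSWORD key name are hypothetical (only DATABASE_HOST is named above), and the real Secret likely carries other keys that are omitted here.

```yaml
# Sketch only. Namespace and the password key name are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: problem-service-secrets
  namespace: algosu              # assumption
type: Opaque
stringData:
  DATABASE_HOST: postgres-problem        # was: postgres (main)
  # hypothetical key; the value must match the one in postgres-problem-secret
  DATABASE_PASSWORD: "REPLACE_WITH_POSTGRES_PROBLEM_PASSWORD"
```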