onelab-k8s-1.27/gitops/docs/OBSERVABILITY.md
timotheereausanofi 4f66f7f7ed feat(observability): OneLab-only Promtail, provisioned OneLab logs dashboard
- Promtail: keep kubernetes-pods in namespace onelab; tag host file logs (host-logs)
- Grafana: enable dashboard sidecar; ConfigMap onelab-logs.json
- Dashboard: stats (total/error/warn heuristics), logs panel, component + regex filters

Made-with: Cursor
2026-03-20 11:28:47 +01:00

Observability (Loki / Promtail / Grafana)

The umbrella chart under gitops/observability/ deploys:

  • Loki — log storage (SingleBinary, filesystem PVC, 7-day retention by default).
  • Promtail — DaemonSet: Kubernetes pod logs (/var/log/pods) plus OneLab file logs from the same host path the app chart uses (/opt/onelab/logs by default).
  • Grafana — explore logs; the datasource points at this release's Loki gateway.
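
As a rough illustration of the Loki settings listed above, the values could look like the sketch below. It assumes the upstream grafana/loki chart (6.x-style keys) is pulled in as a subchart named loki; the PVC size is a placeholder, and the real gitops/observability/values.yaml may be laid out differently.

```yaml
# Sketch only: key layout assumes the upstream grafana/loki chart as a subchart.
loki:
  deploymentMode: SingleBinary
  loki:
    commonConfig:
      replication_factor: 1        # a single instance cannot replicate
    storage:
      type: filesystem             # chunks and index live on the PVC
    compactor:
      retention_enabled: true      # retention is enforced by the compactor
      delete_request_store: filesystem
    limits_config:
      retention_period: 168h       # 7 days
  singleBinary:
    replicas: 1
    persistence:
      enabled: true
      size: 10Gi                   # placeholder size, not taken from the repo
```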

It is synced by the same Argo CD Application as the OneLab chart (gitops/argocd/application.yaml), as the second sources entry with Helm release name onelab-obs, so Service names carry that prefix (for example onelab-obs-loki-gateway).
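
A multi-source Application of this shape might look roughly like the following. This is a sketch, not the contents of gitops/argocd/application.yaml: the Application name, Argo CD namespace, repository URL, revision, and the first source's path are all hypothetical; only the second source's path and release name come from the text above.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: onelab                     # hypothetical name
  namespace: argocd                # assumed Argo CD namespace
spec:
  project: default
  destination:
    server: https://kubernetes.default.svc
    namespace: onelab
  sources:
    # First entry: the OneLab application chart (path is a placeholder).
    - repoURL: https://git.example.com/onelab/onelab-k8s.git   # hypothetical URL
      targetRevision: main                                     # assumed branch
      path: path/to/onelab-chart                               # placeholder
    # Second entry: the observability umbrella chart.
    - repoURL: https://git.example.com/onelab/onelab-k8s.git   # hypothetical URL
      targetRevision: main
      path: gitops/observability
      helm:
        releaseName: onelab-obs    # yields Services such as onelab-obs-loki-gateway
```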

First-time setup

  1. Change the Grafana admin password in gitops/observability/values.yaml (grafana.adminPassword) or switch to admin.existingSecret per the upstream Grafana chart.
  2. Align host paths — if you change persistence.hostPath.logs for OneLab, update promtail.extraVolumes / extraVolumeMounts in the same values.yaml so Promtail still reads the shared log directory.
  3. Multi-node — with hostPath logs, each node only sees its own files; Promtail runs on every node, so you still get coverage when pods move.
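
Steps 1 and 2 could translate into values along these lines. This is a minimal sketch assuming the upstream Grafana and Promtail chart keys; the Secret name grafana-admin and the volume name onelab-logs are hypothetical.

```yaml
grafana:
  admin:
    existingSecret: grafana-admin  # hypothetical Secret, created out of band
    userKey: admin-user            # key holding the username
    passwordKey: admin-password    # key holding the password

promtail:
  # Mount the same host directory the OneLab chart writes to.
  extraVolumes:
    - name: onelab-logs            # hypothetical volume name
      hostPath:
        path: /opt/onelab/logs     # keep equal to persistence.hostPath.logs
  extraVolumeMounts:
    - name: onelab-logs
      mountPath: /opt/onelab/logs
      readOnly: true               # Promtail only needs to read the files
```

Using an existing Secret keeps the admin password out of Git, which matters because values.yaml is committed to the repository.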

OneLab-only ingestion

Promtail adds extraRelabelConfigs so the kubernetes-pods job keeps only pods in namespace onelab. Other namespaces no longer reach Loki (Explore only sees OneLab). Host file logs under /opt/onelab/logs are tagged with namespace: onelab and component: host-logs so they appear in the same queries.
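
The filtering and tagging described above could be expressed roughly as follows. This sketch assumes the upstream Promtail chart's config.snippets keys; the job name and the *.log glob are assumptions, and the actual values may differ.

```yaml
promtail:
  config:
    snippets:
      # Appended to the relabel rules of the kubernetes-pods job.
      extraRelabelConfigs:
        - source_labels: [__meta_kubernetes_namespace]
          regex: onelab
          action: keep             # drop every target outside namespace onelab
      # Extra scrape job for the host file logs (a templated string in the chart).
      extraScrapeConfigs: |
        - job_name: onelab-host-logs          # hypothetical job name
          static_configs:
            - targets: [localhost]
              labels:
                namespace: onelab
                component: host-logs
                __path__: /opt/onelab/logs/*.log   # glob is an assumption
```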

Existing Loki data from before this change may still show non-onelab streams until retention drops them; for a clean index you would need to wipe the Loki PVC (destructive).

Dashboard: OneLab logs

Grafana's dashboard sidecar loads ConfigMap …-dashboard-onelab-logs (JSON: dashboards/onelab-logs.json). Open Dashboards → OneLab logs (uid onelab-logs):

  • Component — multi-select from label_values({namespace="onelab"}, component) (includes host-logs for file logs).
  • Line filter — regex applied to log line content (.* = all).
  • Stat panels: total lines, heuristic error / warning counts (tuned for typical text logs, not strict JSON parsing).
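
For reference, enabling the dashboard sidecar in the upstream Grafana chart usually comes down to a fragment like this. It is a sketch assuming the chart's default label key; the repository may use a different label or extra options.

```yaml
grafana:
  sidecar:
    dashboards:
      enabled: true
      # The sidecar imports every ConfigMap that carries this label key,
      # so the dashboard ConfigMap needs it in metadata.labels.
      label: grafana_dashboard
```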

Access Grafana

The umbrella chart creates an Ingress named grafana-onelab (templates/ingress-grafana-onelab.yaml) that uses Traefik and cert-manager, matching the OneLab web UI pattern in gitops/values/k3s-example.yaml:

  • Host: grafana.k8s.selair.it — edit grafanaOnelabIngress and grafana.ini.server in gitops/observability/values.yaml together.
  • TLS Secret: grafana-tls-k8s-selair (cert-manager with letsencrypt-prod).
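
Rendered, the Ingress might look approximately like this. It is a sketch rather than the template's actual output: it assumes letsencrypt-prod is a ClusterIssuer, that the ingress class is named traefik, and that Grafana is served by the onelab-obs-grafana Service on port 80 (as in the port-forward command in this section).

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-onelab
  namespace: onelab
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumes a ClusterIssuer
spec:
  ingressClassName: traefik        # assumed class name
  tls:
    - hosts:
        - grafana.k8s.selair.it
      secretName: grafana-tls-k8s-selair   # populated by cert-manager
  rules:
    - host: grafana.k8s.selair.it
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: onelab-obs-grafana
                port:
                  number: 80
```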

Point DNS at your ingress, sync the app, then open https://<grafana-host>/ (user admin until you change values).

For debugging without DNS:

kubectl -n onelab port-forward svc/onelab-obs-grafana 3000:80

Upgrading chart dependencies

From gitops/observability/:

helm dependency update

Commit updated Chart.lock and charts/*.tgz if you want Argo to render without calling remote Helm repos at sync time.
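
The dependencies that helm dependency update resolves are declared in the umbrella chart's Chart.yaml, which could look something like the sketch below. The chart name, chart version, and the dependency version ranges are placeholders, not values taken from the repository.

```yaml
apiVersion: v2
name: observability                # hypothetical chart name
version: 0.1.0                     # hypothetical chart version
dependencies:
  - name: loki
    repository: https://grafana.github.io/helm-charts
    version: "6.x.x"               # placeholder range; Chart.lock pins the real version
  - name: promtail
    repository: https://grafana.github.io/helm-charts
    version: "6.x.x"               # placeholder range
  - name: grafana
    repository: https://grafana.github.io/helm-charts
    version: "8.x.x"               # placeholder range
```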

OneLab logs.path

The OneLab chart now sets onelab.logs.path: "/logs" in the generated configuration so application file logs match the /logs volume mount (see Enterprise guide §7.2).