Files
onelab-k8s-1.27/gitops/docs/OBSERVABILITY.md
timotheereausanofi 4f66f7f7ed feat(observability): OneLab-only Promtail, provisioned OneLab logs dashboard
- Promtail: keep kubernetes-pods in namespace onelab; tag host file logs (host-logs)
- Grafana: enable dashboard sidecar; ConfigMap onelab-logs.json
- Dashboard: stats (total/error/warn heuristics), logs panel, component + regex filters

Made-with: Cursor
2026-03-20 11:28:47 +01:00

59 lines
3.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Observability (Loki / Promtail / Grafana)
The umbrella chart under [`gitops/observability/`](../observability/) deploys:
- **Loki** — log storage (SingleBinary, filesystem PVC, 7-day retention by default).
- **Promtail** — DaemonSet: Kubernetes pod logs (`/var/log/pods`) plus **OneLab file logs** from the same host path the app chart uses (`/opt/onelab/logs` by default).
- **Grafana** — explore logs; datasource points at this releases Loki gateway.
It is synced by the **same** Argo CD Application as the OneLab chart ([`gitops/argocd/application.yaml`](../argocd/application.yaml)): second `sources` entry, Helm release name **`onelab-obs`** (so services are like `onelab-obs-loki-gateway`).
## First-time setup
1. **Change the Grafana admin password** in [`gitops/observability/values.yaml`](../observability/values.yaml) (`grafana.adminPassword`) or switch to `admin.existingSecret` per the upstream Grafana chart.
2. **Align host paths** — if you change `persistence.hostPath.logs` for OneLab, update `promtail.extraVolumes` / `extraVolumeMounts` in the same `values.yaml` so Promtail still reads the shared log directory.
3. **Multi-node** — with `hostPath` logs, each node only sees its own files; Promtail runs on every node, so you still get coverage when pods move.
## OneLab-only ingestion
Promtail adds **`extraRelabelConfigs`** so the **kubernetes-pods** job **keeps only** pods in namespace **`onelab`**. Other namespaces no longer reach Loki (Explore only sees OneLab). Host file logs under `/opt/onelab/logs` are tagged with **`namespace: onelab`** and **`component: host-logs`** so they appear in the same queries.
Existing Loki data from before this change may still show non-`onelab` streams until **retention** drops them; for a clean index you would need to wipe the Loki PVC (destructive).
## Dashboard: **OneLab logs**
Grafanas **dashboard sidecar** loads ConfigMap **`…-dashboard-onelab-logs`** (JSON: `dashboards/onelab-logs.json`). Open **Dashboards → OneLab logs** (`uid` `onelab-logs`):
- **Component** — multi-select from `label_values({namespace="onelab"}, component)` (includes **`host-logs`** for file logs).
- **Line filter** — regex applied to log line content (`.*` = all).
- Stat panels: total lines, heuristic **error** / **warning** counts (tuned for typical text logs, not strict JSON parsing).
## Access Grafana
An **Ingress** named **`grafana-onelab`** is created by the umbrella chart (`templates/ingress-grafana-onelab.yaml`), Traefik + cert-manager, matching the OneLab web UI pattern in `gitops/values/k3s-example.yaml`:
- Host: **`grafana.k8s.selair.it`** — edit `grafanaOnelabIngress` and `grafana.ini.server` in `gitops/observability/values.yaml` together.
- TLS Secret: **`grafana-tls-k8s-selair`** (cert-manager with `letsencrypt-prod`).
Point DNS at your ingress, sync the app, then open `https://<grafana-host>/` (user `admin` until you change values).
For debugging without DNS:
```bash
kubectl -n onelab port-forward svc/onelab-obs-grafana 3000:80
```
## Upgrading chart dependencies
From `gitops/observability/`:
```bash
helm dependency update
```
Commit updated `Chart.lock` and `charts/*.tgz` if you want Argo to render without calling remote Helm repos at sync time.
## OneLab `logs.path`
The OneLab chart now sets `onelab.logs.path: "/logs"` in the generated configuration so application file logs match the `/logs` volume mount (see Enterprise guide §7.2).