Files
onelab-k8s-1.27/gitops/README.md
2026-03-20 12:06:20 +01:00

218 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# OneLab GitOps (Argo CD)
This directory is the **declarative source** for OneLab on Kubernetes. Argo CD applies two **Helm-based sources** from Git (Argo invokes Helm internally; you do not run a separate Helm install workflow).
Legacy Swarm install lives under [`app/`](../app/) (`docker-compose.yml`); this tree replaces `docker stack deploy` for k3s/Kubernetes.
## Layout
| Path | Purpose |
|------|---------|
| [`charts/onelab`](charts/onelab) | OneLab chart (StatefulSets, Deployments, Services, ConfigMaps, Secrets) — **Argo source 1** |
| [`values/`](values/) | Environment values (e.g. [`values/k3s-example.yaml`](values/k3s-example.yaml)); reference from `helm.valueFiles` |
| [`observability/`](observability/) | Loki / Promtail / Grafana umbrella chart — **Argo source 2** (`releaseName: onelab-obs`) |
| [`argocd/application.yaml`](argocd/application.yaml) | `Application` manifest (`spec.sources`, namespace `onelab`) |
| [`argocd/jsonpatch-multisource.json`](argocd/jsonpatch-multisource.json) | One-time JSON patch if the live `Application` stuck on `spec.source` |
## Prerequisites
1. **Kubernetes** (e.g. k3s) with a default **StorageClass** for Postgres/Rabbit PVCs (e.g. `local-path`).
2. **Image pull** to `hub.andrewalliance.com` — registry Secret + `imagePullSecrets` (see [`values/k3s-example.yaml`](values/k3s-example.yaml) and [Private registry credentials](#private-registry-credentials)).
3. **RabbitMQ TLS** Secret `onelab-rabbit-tls` (or `rabbitmq.tls.embed` in a private values file) — [RabbitMQ TLS](#rabbitmq-tls).
4. **Host paths** when using `persistence.mode: hostPath`: `/opt/onelab/data` and `/opt/onelab/logs` on nodes that run those pods, or use RWX storage for multi-node.
## Bootstrap (registry, Argo repo, TLS)
### Private registry credentials
By default, `gitops/values/k3s-example.yaml` matches the Swarm installer (`app/playbooks/tasks/manage-images.yml`): user **`public`**, password **`Andrew01..Release`**, and the chart creates Secret **`hub-andrewalliance`** when `registry.createPullSecret: true`.
To use other credentials, override `registry.username` / `registry.password` or create the secret manually:
```bash
kubectl create secret docker-registry hub-andrewalliance -n onelab \
--docker-server=hub.andrewalliance.com \
--docker-username='YOUR_USER' \
--docker-password='YOUR_PASSWORD'
```
…and set `registry.createPullSecret: false` plus `imagePullSecrets: [{ name: hub-andrewalliance }]`.
#### StatefulSet pods still get `401 Unauthorized` / `ImagePullBackOff` after enabling registry auth
If `db-0` / `rabbitmq-0` were created **before** `imagePullSecrets` existed, their **Pod** spec can still use anonymous pulls until they are recreated:
```bash
kubectl delete pod -n onelab db-0 rabbitmq-0
```
The chart adds a pod-template checksum so after you change registry settings in Git and **Argo syncs**, workloads normally roll; a one-time delete is enough if pods were created before pull secrets existed.
### Argo CD private Git repository
If the Application shows `authentication required: Unauthorized`, register the repo in Argo CD (CLI or UI):
```bash
# Example; use a deploy token or PAT with repo read access
argocd repo add https://git.luneski.fr/luneski/onelab-k8s.git \
--username git \
--password YOUR_TOKEN
```
Then apply the Application:
```bash
kubectl apply -f gitops/argocd/application.yaml
```
**Single controller:** Use **only** this Argo CD `Application` for `onelab` / `onelab-obs`. Do not manage the same namespace with a separate **Helm CLI** release.
### RabbitMQ TLS
Secret `onelab-rabbit-tls` must exist before RabbitMQ starts (created once from `app/rabbit/ssl/` or your own PEMs).
### Argo CD version and observability stack
[`argocd/application.yaml`](argocd/application.yaml) uses **`spec.sources`** (two Helm charts in one Application). Use **Argo CD 2.6 or newer**.
If the `onelab` Application was created earlier with **`spec.source` only**, Argo will **not** show the observability resources until you remove `source` and set `sources` — see [Migrating `spec.source` → `spec.sources`](#migrating-specsource--specsources) below.
The second source installs Loki/Promtail/Grafana from [`observability/`](observability/) (`releaseName: onelab-obs`). Set a strong **`grafana.adminPassword`** in [`observability/values.yaml`](observability/values.yaml) before production — details in [Observability](#observability-loki--promtail--grafana).
## Deploy with Argo CD
1. Push this repo to a Git remote Argo CD can read.
2. Register the repo in Argo CD (CLI or UI) if it is private — [Argo CD private Git repository](#argo-cd-private-git-repository).
3. Edit [`argocd/application.yaml`](argocd/application.yaml): `repoURL`, `targetRevision`, and per-source `helm.valueFiles` if needed.
4. Apply the Application:
```bash
kubectl apply -f gitops/argocd/application.yaml
```
**Requirements:** Argo CD **2.6+** (`spec.sources`).
Each entry under `spec.sources` has its own `helm.releaseName` and `helm.valueFiles` (paths are **relative to that sources `path`**):
- Source `gitops/charts/onelab` → e.g. `../../values/k3s-example.yaml`
- Source `gitops/observability` → e.g. `values.yaml`
Both targets deploy into namespace **`onelab`**. Sync waves order: Postgres → Redis/Rabbit/config → application workloads.
### Migrating `spec.source` → `spec.sources`
If the `onelab` `Application` was created earlier with **`spec.source` only**, a plain `kubectl apply` of the new file may **not** remove `spec.source`, and Argo will never reconcile the observability chart.
Check:
```bash
kubectl get application onelab -n argocd -o jsonpath='{.spec.source}{"\n"}{.spec.sources}{"\n"}'
```
If `source` is set and `sources` is empty, patch once (adjust `repoURL` in the patch file if needed):
```bash
kubectl patch application onelab -n argocd --type json --patch-file gitops/argocd/jsonpatch-multisource.json
```
Then sync in Argo (or wait for auto-sync).
### Single controller
Manage these workloads **only** through this Argo CD `Application`. Do not drive the same resources with a parallel **Helm CLI** release.
### Logs / Grafana
See [Observability (Loki / Promtail / Grafana)](#observability-loki--promtail--grafana) — set a strong `grafana.adminPassword` in [`observability/values.yaml`](observability/values.yaml) before production.
## Observability (Loki / Promtail / Grafana)
The umbrella chart under [`observability/`](observability/) deploys:
- **Loki** — log storage (SingleBinary, filesystem PVC, 7-day retention by default).
- **Promtail** — DaemonSet: Kubernetes pod logs (`/var/log/pods`) plus **OneLab file logs** from the same host path the app chart uses (`/opt/onelab/logs` by default).
- **Grafana** — explore logs; datasource points at this releases Loki gateway.
It is synced by the **same** Argo CD Application as the OneLab chart ([`argocd/application.yaml`](argocd/application.yaml)): second `sources` entry, Argo **`helm.releaseName`** **`onelab-obs`** (so services are like `onelab-obs-loki-gateway`).
### First-time setup
1. **Change the Grafana admin password** in [`observability/values.yaml`](observability/values.yaml) (`grafana.adminPassword`) or switch to `admin.existingSecret` per the upstream Grafana chart.
2. **Align host paths** — if you change `persistence.hostPath.logs` for OneLab, update `promtail.extraVolumes` / `extraVolumeMounts` in the same `values.yaml` so Promtail still reads the shared log directory.
3. **Multi-node** — with `hostPath` logs, each node only sees its own files; Promtail runs on every node, so you still get coverage when pods move.
### OneLab-only ingestion
Promtail adds **`extraRelabelConfigs`** so the **kubernetes-pods** job **keeps only** pods in namespace **`onelab`**. Other namespaces no longer reach Loki (Explore only sees OneLab). Host file logs under `/opt/onelab/logs` are tagged with **`namespace: onelab`** and **`component: host-logs`** so they appear in the same queries.
Existing Loki data from before this change may still show non-`onelab` streams until **retention** drops them; for a clean index you would need to wipe the Loki PVC (destructive).
### Dashboard: **OneLab logs**
Grafanas **dashboard sidecar** loads ConfigMap **`…-dashboard-onelab-logs`** (JSON: `observability/dashboards/onelab-logs.json`). Open **Dashboards → OneLab logs** (`uid` `onelab-logs`):
- **Component** — multi-select from `label_values({namespace="onelab"}, component)` (includes **`host-logs`** for file logs).
- **Line filter** — regex applied to log line content (`.*` = all).
- Stat panels: total lines, heuristic **error** / **warning** counts (tuned for typical text logs, not strict JSON parsing).
#### Grafana pod: `init-chown-data` CrashLoopBackOff
The upstream chart runs an init container as **root** to `chown` `/var/lib/grafana`. Clusters with **Pod Security Admission** (often on k3s) commonly block that. This repo sets **`grafana.initChownData.enabled: false`**; the Grafana pod keeps **`fsGroup: 472`** so the PVC is usually group-writable. If Grafana still cannot write to disk, delete the Grafana PVC once after the change or relax PSA for namespace `onelab`.
### Access Grafana
An **Ingress** named **`grafana-onelab`** is created by the umbrella chart (`observability/templates/ingress-grafana-onelab.yaml`), Traefik + cert-manager, matching the OneLab web UI pattern in `gitops/values/k3s-example.yaml`:
- Host: **`grafana.k8s.selair.it`** — edit `grafanaOnelabIngress` and `grafana.ini.server` in `gitops/observability/values.yaml` together.
- TLS Secret: **`grafana-tls-k8s-selair`** (cert-manager with `letsencrypt-prod`).
Point DNS at your ingress, sync the app, then open `https://<grafana-host>/` (user `admin` until you change values).
For debugging without DNS:
```bash
kubectl -n onelab port-forward svc/onelab-obs-grafana 3000:80
```
### Maintainers: vendored chart dependencies
The observability umbrella vendors upstream charts under `gitops/observability/charts/*.tgz` so **Argo CD** can render without relying on live Helm repo access at sync time.
When bumping Loki / Promtail / Grafana versions, from `gitops/observability/` run:
```bash
helm dependency update
```
Commit the updated `Chart.lock` and `charts/*.tgz` with your Git change. This is **repository packaging**, not an alternative install path — deploy still happens only via Argo CD.
### OneLab `logs.path`
The OneLab chart sets `onelab.logs.path: "/logs"` in the generated configuration so application file logs match the `/logs` volume mount (see Enterprise guide §7.2).
## kubectl / credentials
If `kubectl` reports *You must be logged in*, refresh your kubeconfig (e.g. k3s `/etc/rancher/k3s/k3s.yaml` on the server or your auth plugin) before applying manifests.
## Application configuration (`configurations.yml`)
You do not need to edit [`app/configurations.yml`](../app/configurations.yml) in Git for Kubernetes. The chart renders `configurations.yml` from [`charts/onelab/files/configurations.gotmpl`](charts/onelab/files/configurations.gotmpl) into Secret **`onelab-configurations`**.
1. **Values (recommended)** — set `onelab.compliance`, `onelab.ldap`, etc. See [`values/instance-overrides.example.yaml`](values/instance-overrides.example.yaml). Add extra paths under **`spec.sources[].helm.valueFiles`** for the `gitops/charts/onelab` source (paths relative to `gitops/charts/onelab`).
2. **Bring your own Secret** — set `configuration.existingSecretName`; the Secret must contain key **`configurations.yml`**.
LDAP TLS paths in values are container paths; mount PEMs on `ldap-worker` if required.
## Ingress (web UI)
Set `ingress.enabled`, `ingress.host`, and optional TLS in values. Traffic goes to Service **`revproxy`**. On k3s, `ingress.className: traefik` matches the default controller. For cert-manager, set `ingress.tls`, `ingress.tlsSecretName`, and `ingress.certManager.clusterIssuer`; DNS for `ingress.host` must resolve before ACME runs.
## Developer note (local render)
Running **`helm template` on Windows** against some paths can return empty `.Files.Get` content; the OneLab chart uses `fromYaml (.Files.AsConfig)` where needed. **Argo CD runs on Linux** and renders the same charts in-cluster — this is a local-tooling caveat, not a second deploy path.
## Not migrated in this chart
- **Edge proxy stack** (`app/proxy/docker-compose.yml`, host 80/443 Swarm) — use **Ingress** + `revproxy` and optional cert-manager.
- **Swarm-only secrets** (e.g. `ssl_passphrase`) — use Kubernetes Secrets or external operators.