# OneLab GitOps (Argo CD)

This directory is the **declarative source** for OneLab on Kubernetes. Argo CD applies two **Helm-based sources** from Git (Argo invokes Helm internally; you do not run a separate Helm install workflow).

Legacy Swarm install lives under [`app/`](../app/) (`docker-compose.yml`); this tree replaces `docker stack deploy` for k3s/Kubernetes.

## Layout

| Path | Purpose |
|------|---------|
| [`charts/onelab`](charts/onelab) | OneLab chart (StatefulSets, Deployments, Services, ConfigMaps, Secrets); **Argo source 1** |
| [`values/`](values/) | Environment values (e.g. [`values/k3s-example.yaml`](values/k3s-example.yaml)); reference from `helm.valueFiles` |
| [`observability/`](observability/) | Loki / Promtail / Grafana umbrella chart; **Argo source 2** (`releaseName: onelab-obs`) |
| [`argocd/application.yaml`](argocd/application.yaml) | `Application` manifest (`spec.sources`, namespace `onelab`) |
| [`argocd/jsonpatch-multisource.json`](argocd/jsonpatch-multisource.json) | One-time JSON patch if the live `Application` is stuck on `spec.source` |

## Prerequisites

1. **Kubernetes** (e.g. k3s) with a default **StorageClass** for Postgres/Rabbit PVCs (e.g. `local-path`).
2. **Image pull** to `hub.andrewalliance.com`: requires a registry Secret and `imagePullSecrets` (see [`values/k3s-example.yaml`](values/k3s-example.yaml) and [Private registry credentials](#private-registry-credentials)).
3. **RabbitMQ TLS** Secret `onelab-rabbit-tls` (or `rabbitmq.tls.embed` in a private values file); see [RabbitMQ TLS](#rabbitmq-tls).
4. **Host paths** when using `persistence.mode: hostPath`: `/opt/onelab/data` and `/opt/onelab/logs` must exist on the nodes that run those pods; for multi-node clusters, use RWX storage instead.

## Bootstrap (registry, Argo repo, TLS)

### Private registry credentials

By default, `gitops/values/k3s-example.yaml` matches the Swarm installer (`app/playbooks/tasks/manage-images.yml`): user **`public`**, password **`Andrew01..Release`**, and the chart creates Secret **`hub-andrewalliance`** when `registry.createPullSecret: true`.

To use other credentials, override `registry.username` / `registry.password` or create the secret manually:

```bash
kubectl create secret docker-registry hub-andrewalliance -n onelab \
  --docker-server=hub.andrewalliance.com \
  --docker-username='YOUR_USER' \
  --docker-password='YOUR_PASSWORD'
```

If you create the secret manually, also set `registry.createPullSecret: false` and `imagePullSecrets: [{ name: hub-andrewalliance }]`.

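
As a rough illustration of the manual-secret route, the override could live in a small private values file. This is only a sketch: the file name `registry-overrides.yaml` is hypothetical, and placing `imagePullSecrets` at the top level of the values is an assumption, so check `values/k3s-example.yaml` for where the chart actually reads it.

```bash
# Write a private values override (the file name is a hypothetical example).
cat > registry-overrides.yaml <<'EOF'
registry:
  createPullSecret: false       # the Secret was created manually with kubectl
imagePullSecrets:
  - name: hub-andrewalliance    # assumed top-level key; verify against the chart
EOF
```

A file like this would then be listed under `helm.valueFiles` of the `gitops/charts/onelab` source so that Argo CD picks it up.
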
#### StatefulSet pods still get `401 Unauthorized` / `ImagePullBackOff` after enabling registry auth

If `db-0` / `rabbitmq-0` were created **before** `imagePullSecrets` existed, their **Pod** spec can still use anonymous pulls until they are recreated:

```bash
kubectl delete pod -n onelab db-0 rabbitmq-0
```

The chart adds a pod-template checksum, so workloads normally roll after you change registry settings in Git and **Argo syncs**. The one-time delete is only needed for pods that were created before the pull secrets existed.

### Argo CD private Git repository

If the Application shows `authentication required: Unauthorized`, register the repo in Argo CD (CLI or UI):

```bash
# Example; use a deploy token or PAT with repo read access
argocd repo add https://git.luneski.fr/luneski/onelab-k8s.git \
  --username git \
  --password YOUR_TOKEN
```

Then apply the Application:

```bash
kubectl apply -f gitops/argocd/application.yaml
```

**Single controller:** Use **only** this Argo CD `Application` for `onelab` / `onelab-obs`. Do not manage the same namespace with a separate **Helm CLI** release.

### RabbitMQ TLS

Secret `onelab-rabbit-tls` must exist before RabbitMQ starts (created once from `app/rabbit/ssl/` or your own PEMs).

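
One possible way to create the Secret from a directory of PEM files is sketched below. It assumes the chart accepts a generic Secret whose keys are the PEM file names; the expected key names are not stated in this README, so verify them against the chart templates first. The function name and the `KUBECTL` override variable are illustrative, not part of this repo.

```bash
# Sketch: build the onelab-rabbit-tls Secret from a directory of PEM files.
# Set KUBECTL=echo to preview the command without touching the cluster.
create_rabbit_tls_secret() {
  local pem_dir="${1:-app/rabbit/ssl}"   # default: PEMs shipped with the Swarm install
  "${KUBECTL:-kubectl}" create secret generic onelab-rabbit-tls \
    --namespace onelab \
    --from-file="$pem_dir"               # each file in the directory becomes one key
}
```

For example, `create_rabbit_tls_secret /path/to/pems` would run against the current kubeconfig context.
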
### Argo CD version and observability stack

[`argocd/application.yaml`](argocd/application.yaml) uses **`spec.sources`** (two Helm charts in one Application). Use **Argo CD 2.6 or newer**.

If the `onelab` Application was created earlier with **`spec.source` only**, Argo will **not** show the observability resources until you remove `source` and set `sources`. See [Migrating `spec.source` → `spec.sources`](#migrating-specsource--specsources) below.

The second source installs Loki/Promtail/Grafana from [`observability/`](observability/) (`releaseName: onelab-obs`). Set a strong **`grafana.adminPassword`** in [`observability/values.yaml`](observability/values.yaml) before production; details are in [Observability](#observability-loki--promtail--grafana).

## Deploy with Argo CD

1. Push this repo to a Git remote Argo CD can read.
2. Register the repo in Argo CD (CLI or UI) if it is private (see [Argo CD private Git repository](#argo-cd-private-git-repository)).
3. Edit [`argocd/application.yaml`](argocd/application.yaml): `repoURL`, `targetRevision`, and per-source `helm.valueFiles` if needed.
4. Apply the Application:

```bash
kubectl apply -f gitops/argocd/application.yaml
```

**Requirements:** Argo CD **2.6+** (`spec.sources`).

Each entry under `spec.sources` has its own `helm.releaseName` and `helm.valueFiles` (paths are **relative to that source’s `path`**):

- Source `gitops/charts/onelab` → e.g. `../../values/k3s-example.yaml`
- Source `gitops/observability` → e.g. `values.yaml`

Both sources deploy into namespace **`onelab`**. Sync waves order the rollout: Postgres → Redis/Rabbit/config → application workloads.

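
To make the two-source layout concrete, the following sketch writes an `Application` of the shape described above to a scratch file, which can be compared with `argocd/application.yaml`. It is not a copy of that file: `project: default`, the destination server, `targetRevision: main`, the first `releaseName: onelab`, and the scratch file name are assumptions. The `repoURL` is the one used in the `argocd repo add` example.

```bash
# Write an illustrative multi-source Application (not the repo's actual manifest).
cat > application-sketch.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: onelab
  namespace: argocd
spec:
  project: default                            # assumption
  destination:
    server: https://kubernetes.default.svc    # assumption: in-cluster destination
    namespace: onelab
  sources:
    - repoURL: https://git.luneski.fr/luneski/onelab-k8s.git
      targetRevision: main                    # assumption
      path: gitops/charts/onelab
      helm:
        releaseName: onelab                   # assumption
        valueFiles:
          - ../../values/k3s-example.yaml     # relative to this source's path
    - repoURL: https://git.luneski.fr/luneski/onelab-k8s.git
      targetRevision: main                    # assumption
      path: gitops/observability
      helm:
        releaseName: onelab-obs
        valueFiles:
          - values.yaml
EOF
```
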
### Migrating `spec.source` → `spec.sources`

If the `onelab` `Application` was created earlier with **`spec.source` only**, a plain `kubectl apply` of the new file may **not** remove `spec.source`, and Argo will never reconcile the observability chart.

Check:

```bash
kubectl get application onelab -n argocd -o jsonpath='{.spec.source}{"\n"}{.spec.sources}{"\n"}'
```

If `source` is set and `sources` is empty, patch once (adjust `repoURL` in the patch file if needed):

```bash
kubectl patch application onelab -n argocd --type json --patch-file gitops/argocd/jsonpatch-multisource.json
```

Then sync in Argo (or wait for auto-sync).

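
The contents of `argocd/jsonpatch-multisource.json` are not reproduced in this README. A JSON Patch (RFC 6902) for this kind of migration would typically remove `/spec/source` and add `/spec/sources`; the sketch below writes such a patch to a scratch file for illustration. The scratch file name, `targetRevision`, and the first `releaseName` are assumptions, and the real patch file may differ.

```bash
# Illustrative JSON Patch: drop the single source, add the two-source list.
# targetRevision "main" and releaseName "onelab" are assumed values.
cat > jsonpatch-sketch.json <<'EOF'
[
  { "op": "remove", "path": "/spec/source" },
  { "op": "add", "path": "/spec/sources", "value": [
    { "repoURL": "https://git.luneski.fr/luneski/onelab-k8s.git",
      "targetRevision": "main",
      "path": "gitops/charts/onelab",
      "helm": { "releaseName": "onelab",
                "valueFiles": ["../../values/k3s-example.yaml"] } },
    { "repoURL": "https://git.luneski.fr/luneski/onelab-k8s.git",
      "targetRevision": "main",
      "path": "gitops/observability",
      "helm": { "releaseName": "onelab-obs",
                "valueFiles": ["values.yaml"] } }
  ] }
]
EOF
```

The `remove` operation fails if `/spec/source` does not exist, which is why the check above should come first.
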
### Single controller

Manage these workloads **only** through this Argo CD `Application`. Do not drive the same resources with a parallel **Helm CLI** release.

### Logs / Grafana

See [Observability (Loki / Promtail / Grafana)](#observability-loki--promtail--grafana), and set a strong `grafana.adminPassword` in [`observability/values.yaml`](observability/values.yaml) before production.

## Observability (Loki / Promtail / Grafana)

The umbrella chart under [`observability/`](observability/) deploys:

- **Loki**: log storage (SingleBinary, filesystem PVC, 7-day retention by default).
- **Promtail**: DaemonSet that collects Kubernetes pod logs (`/var/log/pods`) plus **OneLab file logs** from the same host path the app chart uses (`/opt/onelab/logs` by default).
- **Grafana**: explore logs; the datasource points at this release’s Loki gateway.

It is synced by the **same** Argo CD Application as the OneLab chart ([`argocd/application.yaml`](argocd/application.yaml)): it is the second `sources` entry, with Argo **`helm.releaseName`** **`onelab-obs`** (so services are named like `onelab-obs-loki-gateway`).

### First-time setup

1. **Change the Grafana admin password** in [`observability/values.yaml`](observability/values.yaml) (`grafana.adminPassword`) or switch to `admin.existingSecret` per the upstream Grafana chart.
2. **Align host paths**: if you change `persistence.hostPath.logs` for OneLab, update `promtail.extraVolumes` / `extraVolumeMounts` in the same `values.yaml` so Promtail still reads the shared log directory.
3. **Multi-node**: with `hostPath` logs, each node only sees its own files; Promtail runs on every node, so you still get coverage when pods move.

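
For the `admin.existingSecret` route in step 1, a sketch like the following could generate a random password and store it in a Secret. The Secret name `grafana-admin`, the function names, and the `KUBECTL` override variable are hypothetical. The keys `admin-user` and `admin-password` are the upstream Grafana chart's defaults for `admin.userKey` / `admin.passwordKey`; confirm them for the vendored chart version.

```bash
# Generate a random password: 24 random bytes, base64-encoded (32 characters).
gen_password() {
  head -c 24 /dev/urandom | base64
}

# Sketch: store admin credentials in a Secret for Grafana's admin.existingSecret.
# Set KUBECTL=echo to preview the command without touching the cluster.
create_grafana_admin_secret() {
  local password
  password="$(gen_password)"
  "${KUBECTL:-kubectl}" create secret generic grafana-admin \
    --namespace onelab \
    --from-literal=admin-user=admin \
    --from-literal=admin-password="$password"
}
```

The values file would then reference the Secret by name (presumably under the `grafana` key of the umbrella chart) instead of carrying `grafana.adminPassword` in Git.
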
### OneLab-only ingestion

Promtail adds **`extraRelabelConfigs`** so the **kubernetes-pods** job **keeps only** pods in namespace **`onelab`**. Other namespaces no longer reach Loki (Explore only sees OneLab). Host file logs under `/opt/onelab/logs` are tagged with **`namespace: onelab`** and **`component: host-logs`** so they appear in the same queries.

Existing Loki data from before this change may still show non-`onelab` streams until **retention** drops them; for a clean index you would need to wipe the Loki PVC (destructive).

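
A namespace filter of this kind is usually a single `keep` relabel rule. The sketch below writes one to a scratch file, assuming the upstream Promtail chart's `config.snippets.extraRelabelConfigs` key nested under `promtail` in the umbrella values. The rule actually shipped in `observability/values.yaml` may differ, and the scratch file name is hypothetical.

```bash
# Illustrative relabel rule; compare with observability/values.yaml.
cat > promtail-relabel-sketch.yaml <<'EOF'
promtail:
  config:
    snippets:
      extraRelabelConfigs:
        # Keep only discovered pod targets whose namespace is "onelab";
        # targets from every other namespace are dropped before scraping.
        - source_labels: [__meta_kubernetes_namespace]
          action: keep
          regex: onelab
EOF
```
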
### Dashboard: **OneLab logs**

Grafana’s **dashboard sidecar** loads ConfigMap **`…-dashboard-onelab-logs`** (JSON: `observability/dashboards/onelab-logs.json`). Open **Dashboards → OneLab logs** (`uid` `onelab-logs`):

- **Component**: multi-select from `label_values({namespace="onelab"}, component)` (includes **`host-logs`** for file logs).
- **Line filter**: regex applied to log line content (`.*` = all).
- Stat panels: total lines, heuristic **error** / **warning** counts (tuned for typical text logs, not strict JSON parsing).

#### Grafana pod: `init-chown-data` CrashLoopBackOff

The upstream chart runs an init container as **root** to `chown` `/var/lib/grafana`. Clusters with **Pod Security Admission** (often on k3s) commonly block that. This repo sets **`grafana.initChownData.enabled: false`**; the Grafana pod keeps **`fsGroup: 472`** so the PVC is usually group-writable. If Grafana still cannot write to disk, delete the Grafana PVC once after the change or relax PSA for namespace `onelab`.

### Access Grafana

The umbrella chart creates an **Ingress** named **`grafana-onelab`** (`observability/templates/ingress-grafana-onelab.yaml`) that uses Traefik and cert-manager, matching the OneLab web UI pattern in `gitops/values/k3s-example.yaml`:

- Host: **`grafana.k8s.selair.it`**; edit `grafanaOnelabIngress` and `grafana.ini.server` in `gitops/observability/values.yaml` together.
- TLS Secret: **`grafana-tls-k8s-selair`** (cert-manager with `letsencrypt-prod`).

Point DNS at your ingress, sync the app, then open `https://<grafana-host>/` (user `admin` until you change values).

For debugging without DNS:

```bash
kubectl -n onelab port-forward svc/onelab-obs-grafana 3000:80
```

### Maintainers: vendored chart dependencies

The observability umbrella vendors upstream charts under `gitops/observability/charts/*.tgz` so **Argo CD** can render without relying on live Helm repo access at sync time.

When bumping Loki / Promtail / Grafana versions, from `gitops/observability/` run:

```bash
helm dependency update
```

Commit the updated `Chart.lock` and `charts/*.tgz` with your Git change. This is **repository packaging**, not an alternative install path; deployment still happens only via Argo CD.

### OneLab `logs.path`

The OneLab chart sets `onelab.logs.path: "/logs"` in the generated configuration so application file logs match the `/logs` volume mount (see Enterprise guide §7.2).

## kubectl / credentials

If `kubectl` reports *You must be logged in*, refresh your kubeconfig (e.g. k3s `/etc/rancher/k3s/k3s.yaml` on the server or your auth plugin) before applying manifests.

## Application configuration (`configurations.yml`)

You do not need to edit [`app/configurations.yml`](../app/configurations.yml) in Git for Kubernetes. The chart renders `configurations.yml` from [`charts/onelab/files/configurations.gotmpl`](charts/onelab/files/configurations.gotmpl) into Secret **`onelab-configurations`**.

1. **Values (recommended)**: set `onelab.compliance`, `onelab.ldap`, etc. See [`values/instance-overrides.example.yaml`](values/instance-overrides.example.yaml). Add extra paths under **`spec.sources[].helm.valueFiles`** for the `gitops/charts/onelab` source (paths relative to `gitops/charts/onelab`).
2. **Bring your own Secret**: set `configuration.existingSecretName`; the Secret must contain key **`configurations.yml`**.

LDAP TLS paths in values are container paths; mount PEMs on `ldap-worker` if required.

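
For the bring-your-own-Secret option, the Secret could be created from a local file along these lines. This is a sketch: the Secret name `onelab-configurations-custom` is a hypothetical example (it has to match whatever you set in `configuration.existingSecretName`), and the function name and `KUBECTL` override variable are illustrative.

```bash
# Sketch: create a Secret holding a hand-maintained configurations.yml.
# The Secret name is a hypothetical example; it must match
# configuration.existingSecretName in your values.
# Set KUBECTL=echo to preview the command without touching the cluster.
create_config_secret() {
  local config_file="${1:?path to configurations.yml required}"
  "${KUBECTL:-kubectl}" create secret generic onelab-configurations-custom \
    --namespace onelab \
    --from-file=configurations.yml="$config_file"
}
```

The `--from-file=configurations.yml=<path>` form sets the key name explicitly, so the local file can have any name.
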
## Ingress (web UI)

Set `ingress.enabled`, `ingress.host`, and optional TLS in values. Traffic goes to Service **`revproxy`**. On k3s, `ingress.className: traefik` matches the default controller. For cert-manager, set `ingress.tls`, `ingress.tlsSecretName`, and `ingress.certManager.clusterIssuer`; DNS for `ingress.host` must resolve before ACME runs.

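
Put together, an override for the keys named above might look like the sketch below. The host, the TLS Secret name, and the override file name are hypothetical, `ingress.tls` is assumed to be a boolean toggle, and `letsencrypt-prod` is borrowed from the Grafana Ingress setup; check `values/k3s-example.yaml` for the shapes the chart actually expects.

```bash
# Write an illustrative ingress override (the file name is a hypothetical example).
cat > ingress-overrides.yaml <<'EOF'
ingress:
  enabled: true
  className: traefik             # matches the default controller on k3s
  host: onelab.example.com       # hypothetical; DNS must resolve before ACME runs
  tls: true                      # assumed to be a boolean toggle
  tlsSecretName: onelab-tls      # hypothetical Secret name
  certManager:
    clusterIssuer: letsencrypt-prod
EOF
```
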
## Developer note (local render)

Running **`helm template` on Windows** against some paths can return empty `.Files.Get` content; the OneLab chart uses `fromYaml (.Files.AsConfig)` where needed. **Argo CD runs on Linux** and renders the same charts in-cluster; this is a local-tooling caveat, not a second deploy path.

## Not migrated in this chart

- **Edge proxy stack** (`app/proxy/docker-compose.yml`, host 80/443 Swarm): use **Ingress** + `revproxy` and optional cert-manager.
- **Swarm-only secrets** (e.g. `ssl_passphrase`): use Kubernetes Secrets or external operators.