# Data classification & access scope
Closes open question F-01 (data classification + zone separation). Defines the classification scheme dashi enforces on every STAC item, the runtime checks behind it, and how access is scoped to each level.
## The four levels
| Level | Code | Meaning | Examples |
|---|---|---|---|
| Public | `pub` | Open data; no access restriction, internet-publishable. | OSM, Sentinel-2, public DTM |
| Internal | `int` | Operational data; access limited to the operating organisation. | In-house terrain models, internal route plans |
| Restricted | `rst` | Sensitive; access on a need-to-know basis, audit-logged. | Asset locations, infrastructure surveys |
| Confidential | `cnf` | Highly sensitive; explicit per-user grant + 2FA + watermarking. | Personal data, contractual data with NDA |
The four levels mirror the TLP (Traffic Light Protocol) colours (white / green / amber / red) without using TLP literally, since TLP is an information-sharing protocol, not a data-classification scheme. Codes are 3-character ASCII so they fit STAC properties and path segments without escaping.
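For the checks later in this document, the ordering of the codes matters more than their labels: a ceiling or comparison only needs pub < int < rst < cnf. The following is a minimal sketch of that ordering; the `Classification` enum and its helper are illustrative, not part of the dashi codebase.

```python
# Sketch only: one possible encoding of the four codes as an ordered enum,
# assuming dashi compares levels numerically (pub < int < rst < cnf).
from enum import IntEnum


class Classification(IntEnum):
    PUB = 0  # open data, internet-publishable
    INT = 1  # limited to the operating organisation
    RST = 2  # need-to-know, audit-logged
    CNF = 3  # explicit per-user grant + 2FA + watermarking

    @classmethod
    def from_code(cls, code: str) -> "Classification":
        """Map a 3-char code from dashi:classification to its level."""
        return cls[code.upper()]


assert Classification.from_code("rst") > Classification.INT
```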
## Where the level lives
Every STAC item carries:
```json
{
  "properties": {
    "dashi:classification": "int",
    "dashi:access_groups": ["dashi", "team-terrain"],
    "dashi:retention": "1y",
    "dashi:source_kind": "vector",
    ...
  }
}
```
`dashi:classification` is mandatory. `dashi:access_groups` lists the OIDC groups (Authelia / IdP) that may read the asset; it is ignored when the classification is `pub`.
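To illustrate those two rules, an item-level check can be reduced to a few lines. This is a hedged sketch over a plain properties dict; `validate_classification`, the error messages, and the requirement that non-public items carry at least one group are assumptions, not the actual STAC/pgstac validator.

```python
# Illustrative item-level check; not the actual STAC/pgstac validator.
VALID_CODES = {"pub", "int", "rst", "cnf"}


def validate_classification(properties: dict) -> None:
    """Reject a STAC properties bag that lacks or misuses the dashi fields."""
    code = properties.get("dashi:classification")
    if code not in VALID_CODES:
        raise ValueError(f"dashi:classification missing or invalid: {code!r}")

    groups = properties.get("dashi:access_groups", [])
    # Assumption: non-public items should name at least one OIDC group;
    # for pub items the groups list is simply ignored.
    if code != "pub" and not groups:
        raise ValueError(f"{code} item declares no dashi:access_groups")
```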
## Where it is enforced
| Layer | Enforcement |
|---|---|
| Ingest | dashi-ingest --classification <pub\|int\|rst\|cnf> flag (CLI + flow parameter). Default per domain set in docs/onboarding/domains.md. Item rejected if higher than the domain ceiling. |
| STAC catalog | Validator runs on every PUT/POST: missing dashi:classification → 400. |
| Object storage | Per-zone IAM is the floor. Each classification adds a path-prefix policy: rst and cnf items go under s3://processed/<domain>/<rst\|cnf>/... with separate read users. |
| Serving | TiTiler / Martin / TiPG / DuckDB endpoints sit behind oauth2-proxy (see poc/manifests/auth/); the proxy injects X-Forwarded-User + X-Forwarded-Groups headers, and the upstream filters STAC results by group membership (sketched below the table). |
| Backups | cnf dumps additionally pass through age-encryption with a per-cluster public key before being mirrored off-cluster. |
| Audit | STAC reads of items with dashi:classification ∈ {rst, cnf} emit a Loki line via the serving sidecars; query {namespace="dashi-serving",classification=~"rst\|cnf"} for the audit trail. |
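To make the serving row concrete, here is a minimal sketch of the group filter behind oauth2-proxy. It assumes the proxy passes a comma-separated X-Forwarded-Groups header and that STAC items arrive as plain dicts; `readable_items` is a hypothetical helper, not deployed dashi code.

```python
# Sketch of an upstream filter behind oauth2-proxy; the header format and
# the helper name are assumptions, not the actual serving implementation.
def readable_items(items: list[dict], forwarded_groups: str) -> list[dict]:
    """Keep public items plus items sharing a group with the caller."""
    caller_groups = {g.strip() for g in forwarded_groups.split(",") if g.strip()}
    visible = []
    for item in items:
        props = item.get("properties", {})
        if props.get("dashi:classification") == "pub":
            visible.append(item)  # pub: no restriction
        elif caller_groups & set(props.get("dashi:access_groups", [])):
            visible.append(item)  # caller is in at least one allowed group
    return visible
```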
## Domain-default ceiling

`docs/onboarding/domains.md` gains a `max_classification` column. Items exceeding the domain's ceiling are rejected at ingest; a sketch of the check follows the example.
```markdown
| id | title | owner | retention | access | max_classification | …
| gelaende-umwelt | Terrain & environment | Marco | indefinite | internal | int | …
| weather-radar | DWD radolan | … | 90d | public | pub | …
| asset-locations | Internal asset survey | … | 1y | restricted | rst | …
```
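The check itself is small once the codes are ranked. The sketch below assumes the ingest path looks up the domain's `max_classification` before writing the item; the `RANK` table and `check_ceiling` helper are illustrative, not the dashi-ingest implementation.

```python
# Sketch: reject items whose classification exceeds the domain ceiling.
# RANK and check_ceiling are illustrative, not the dashi-ingest code path.
RANK = {"pub": 0, "int": 1, "rst": 2, "cnf": 3}


def check_ceiling(item_classification: str, domain_max: str) -> None:
    """Raise if the item is classified above the domain's max_classification."""
    if RANK[item_classification] > RANK[domain_max]:
        raise ValueError(
            f"item classified {item_classification!r} exceeds "
            f"domain ceiling {domain_max!r}"
        )


check_ceiling("int", "int")    # ok: gelaende-umwelt accepts int
# check_ceiling("rst", "int")  # would raise: above the ceiling
```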
## Classification scheme is documented; runtime enforcement TBD

This document defines what the four levels mean and where they should be enforced. The actual code paths are only partially scaffolded:
- [ ] STAC item supports `dashi:classification` (free-form properties bag)
- [ ] `dashi-ingest --classification` flag (open in FEATURE-IDEAS)
- [ ] STAC validator hook (open: needs pgstac transaction-time validator)
- [ ] Per-classification IAM policy + bucket prefixes (open)
- [ ] oauth2-proxy header → group filter on serving endpoints (open)
- [ ] `cnf` age-encrypted backup mirror (open)
Tracking: every checkbox above is a Phase-2-K follow-up. Until those land, dashi treats every item as `int` by default; the classification is declared via `dashi:classification`, not enforced.
## Disposal
When an item's retention expires (per `domain-template.md` step 7) and its classification is `rst` or `cnf`, the following disposal steps run (sketched in code after the list):
- Object overwrite + DELETE on RustFS (versioning: tombstone versions are also pruned)
- STAC item DELETE
- Loki audit line: `dashi.dispose item=<id> classification=<lvl> reason=retention`
- Off-cluster backup mirror: corresponding object pruned
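A rough sketch of these steps, assuming an S3-compatible client (boto3 pointed at the RustFS endpoint) and a STAC API with the Transactions extension; the bucket, endpoint, and the `dispose` helper are placeholders, and pruning the off-cluster mirror is left to the backup tooling.

```python
# Disposal sketch; names, endpoints, and the helper itself are placeholders.
import logging

import boto3
import requests

log = logging.getLogger("dashi.dispose")


def dispose(bucket: str, key: str, item_id: str, classification: str,
            stac_api: str, collection: str) -> None:
    s3 = boto3.client("s3")  # pointed at the RustFS endpoint in practice

    # 1. Overwrite, then delete every version including delete markers.
    s3.put_object(Bucket=bucket, Key=key, Body=b"\x00")
    versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
    for v in versions.get("Versions", []) + versions.get("DeleteMarkers", []):
        if v["Key"] == key:
            s3.delete_object(Bucket=bucket, Key=key, VersionId=v["VersionId"])

    # 2. Delete the STAC item via the Transactions extension.
    requests.delete(
        f"{stac_api}/collections/{collection}/items/{item_id}"
    ).raise_for_status()

    # 3. Audit line picked up by Loki.
    log.info("dashi.dispose item=%s classification=%s reason=retention",
             item_id, classification)
    # 4. Pruning the off-cluster backup mirror happens out of band.
```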