If you’re running a shared observability platform, you’ve probably faced this question: How do I stop Team A from seeing Team B’s data?
Grafana is fantastic for visualization, but the datasources behind it (Prometheus, Loki, Tempo, and their derivatives) weren’t designed with fine-grained multi-tenancy in mind. You typically get one of two options:
- Basic authentication: Everyone with access sees everything
- Separate instances: Operational overhead multiplied by N teams
Neither scales well. Enterprise solutions exist, but they’re expensive and often locked to specific vendors. Building custom authorization middleware means maintaining bespoke code for each query language. Most organizations end up with some combination of “trust people not to query the wrong namespaces” and “hope for the best.”
We ran into this exact problem at work. Multiple teams sharing a single observability stack, and no good way to enforce boundaries without spinning up separate instances for everyone. So we built Janus.
Together with Erik Kaisler and Irfan Hadzijusufovic, we’ve been working on this for a while, and we’re now releasing it as open source under the AGPLv3 license. A big thank you to our employer, evoila, for giving us the time to work on this.
What Janus Does
Janus sits between Grafana and your datasources, intercepting every query. It extracts identity from OAuth2 tokens, maps that identity to a set of allowed label values, and automatically injects label filters into queries before they reach the backend.
Your users write queries like normal. They don’t need to know about tenant boundaries or remember to add namespace filters. Janus handles it transparently.
The biggest win for us: we can now use a single dashboard for all teams, and everyone only sees data relevant to them. Janus even filters the label values returned by the API, so dropdowns in Grafana are only populated with values you’re actually allowed to query. No more scrolling through namespaces you can’t access anyway.
# User writes:
rate(http_requests_total[5m])
# Janus rewrites to:
rate(http_requests_total{namespace=~"team-a|team-b"}[5m])
The same principle applies across all three query languages:
- PromQL for metrics (Prometheus, Thanos, Mimir, VictoriaMetrics)
- LogQL for logs (Loki)
- TraceQL for traces (Tempo)
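The label-value filtering mentioned above (the part that keeps Grafana dropdowns clean) can be illustrated with a small Python sketch. This is not Janus's actual code. It assumes the backend answers in the Prometheus label-values format (`{"status": "success", "data": [...]}`) and that an allowed entry starting with `~` is a regular expression, mirroring the `"~^demo.*$"` notation used in the policy example. The function names are hypothetical.

```python
import re


def is_allowed(value: str, allowed: list[str]) -> bool:
    """Return True if `value` matches any allowed entry.

    Assumption: entries prefixed with "~" are regular expressions,
    everything else is compared as a literal string.
    """
    for entry in allowed:
        if entry.startswith("~"):
            if re.fullmatch(entry[1:], value):
                return True
        elif entry == value:
            return True
    return False


def filter_label_values(response: dict, allowed: list[str]) -> dict:
    """Drop values the user may not query from a label-values response."""
    kept = [v for v in response.get("data", []) if is_allowed(v, allowed)]
    # Copy the response so the other fields (e.g. "status") are preserved.
    return {**response, "data": kept}


backend_response = {
    "status": "success",
    "data": ["demo", "demo-shop", "kube-system", "observability"],
}
print(filter_label_values(backend_response, ["~^demo.*$", "observability"]))
```

With these inputs, `kube-system` is dropped and the other three values are kept, so the dropdown only offers namespaces the user can actually query.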
How It Works
Janus is a reverse proxy that does three things:
1. Identity Extraction
When a request arrives, Janus extracts the user’s identity from the OAuth2 access token. The identity can come from JWT claims, token introspection, or userinfo endpoints, depending on what your identity provider supports.
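To make the JWT-claims path concrete, here is a minimal Python sketch of reading identity from a token payload. It is an illustration only: it does not verify the signature, expiry, or audience, all of which a real proxy must check before trusting any claim. The claim names `preferred_username` and `groups` are assumptions; your identity provider may use different ones.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a compact JWT WITHOUT verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three segments")
    payload = parts[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def extract_identity(claims: dict) -> tuple[str, list[str]]:
    """Return (user, groups) from the claims.

    Assumption: the user name is in "preferred_username" (falling back
    to "sub") and group membership is in a "groups" claim.
    """
    user = claims.get("preferred_username") or claims.get("sub", "")
    groups = list(claims.get("groups", []))
    return user, groups
```

Token introspection and userinfo lookups would replace `decode_jwt_claims` with an HTTP call to the identity provider, but the resulting claims could feed the same `extract_identity` step.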
2. Policy Resolution
Based on the extracted identity (user, groups, roles, custom claims), Janus determines which label values the user is permitted to query. Policies are defined declaratively:
thanos:
  tenant-header-constraints:
    order-service-team:
      header:
        - X-Scope-OrgID: tenant1
    my-service-team:
      header:
        - X-Scope-OrgID: tenant2
    regex-team:
      header:
        - X-Scope-OrgID: tenant-regex
    multi-group-user:
      header:
        - X-Scope-OrgID: tenant3
    admin:
      header:
        - X-Scope-OrgID: admin-tenant
  user-label-constraints:
    # Regex pattern: starts with "demo" (anchored)
    order-service-team:
      labels:
        - "*"
      namespace:
        - "~^demo.*$"
    # multi-group-user has access to BOTH demo AND observability
    multi-group-user:
      labels:
        - "*"
      namespace:
        - demo
        - observability
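The resolution step can be sketched in Python as a lookup from groups to allowed label values, followed by building a single regex matcher. This is a simplified illustration, not Janus's implementation: the `POLICIES` dictionary only mirrors the `namespace` entries of the example above, taking the union across a user's groups is an assumption, and the `labels: "*"` entries are ignored. Entries starting with `~` are treated as regular expressions, as in the example.

```python
# Hypothetical in-memory form of the label constraints shown above.
POLICIES = {
    "order-service-team": {"namespace": ["~^demo.*$"]},
    "multi-group-user": {"namespace": ["demo", "observability"]},
}

_REGEX_META = set("\\.+*?()|[]{}^$")


def allowed_values(groups: list[str], label: str, policies: dict = POLICIES) -> list[str]:
    """Union of allowed values for `label` across all groups, in first-seen order."""
    result: list[str] = []
    for group in groups:
        for value in policies.get(group, {}).get(label, []):
            if value not in result:
                result.append(value)
    return result


def escape_literal(value: str) -> str:
    """Backslash-escape regex metacharacters so a literal matches only itself."""
    return "".join("\\" + c if c in _REGEX_META else c for c in value)


def to_matcher(label: str, values: list[str]) -> str:
    """Build a label=~"a|b" matcher; refuse if nothing is allowed."""
    if not values:
        raise PermissionError(f"no allowed values for label {label!r}")
    alternatives = [v[1:] if v.startswith("~") else escape_literal(v) for v in values]
    pattern = "|".join(alternatives)
    # Escape backslashes and quotes for a double-quoted query string literal.
    quoted = pattern.replace("\\", "\\\\").replace('"', '\\"')
    return f'{label}=~"{quoted}"'


values = allowed_values(["multi-group-user"], "namespace")
print(to_matcher("namespace", values))  # namespace=~"demo|observability"
```

Raising an error when no values are allowed is a deliberate choice in this sketch: an empty matcher must never be read as "no restriction".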
3. Query Rewriting
Before forwarding the request, Janus parses the query, injects the appropriate label matchers, and reconstructs it. This happens transparently: the datasource receives a valid, scoped query, and the user receives only the results they’re authorized to see.
Why Label-Based Filtering?
Labels are the universal primitive in modern observability. Kubernetes workloads are labeled by namespace, team, environment. Logs carry metadata about their source. Traces propagate context through service boundaries.
By filtering at the label level, Janus works with your existing data model. No need to restructure your metrics or maintain separate Loki tenants. If you can express a boundary as a label selector, Janus can enforce it.
This also means enforcement happens at query time, not ingestion time. You can retroactively apply access controls without re-ingesting historical data.
Deployment
Janus is designed for Kubernetes-native deployment, though it runs anywhere you can run a container.
Without Janus
In a typical setup, Grafana connects directly to each datasource:
flowchart LR
G[Grafana] --> P[Prometheus/Mimir/Thanos]
G --> L[Loki]
G --> T[Tempo]
Every user with Grafana access can query any data in the backend; there is no tenant isolation.
With Janus
Janus sits between Grafana and your datasources, enforcing access control transparently:
flowchart LR
G[Grafana] --> J[Janus]
J --> P[Prometheus/Mimir/Thanos]
J --> L[Loki]
J --> T[Tempo]
J <--> IDP[Identity Provider]
IDP <--> G
Query Flow
Here’s what happens when a user runs a query:
sequenceDiagram
participant U as User
participant G as Grafana
participant J as Janus
participant DS as Datasource
U->>G: Execute query
G->>J: Forward query + OAuth2 token
J->>J: Validate token / get claims
J->>J: Resolve policy → allowed labels
J->>J: Rewrite query with label filters
J->>DS: Execute scoped query
DS-->>J: Filtered results
J-->>G: Return results
G-->>U: Display dashboard
Configure Grafana to use Janus as the datasource URL instead of pointing directly at your backends. Janus handles the rest.
Pre-built container images are available from the repository.
The Query Rewriting Challenge
The interesting part was supporting three different query languages. Each has different syntax, different semantics for label matching, and different edge cases.
PromQL label matchers are relatively straightforward: inject {label=~"value1|value2"} and merge it with the existing selectors. LogQL adds pipeline stages that need to be preserved. TraceQL has its own structural query syntax for spans and traces.
Janus uses dedicated parsers for each language rather than regex manipulation. This ensures queries remain valid after rewriting and handles edge cases like nested expressions, subqueries, and aggregations correctly.
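To show why working on a parsed tree is more robust than string manipulation, here is a toy Python sketch of the enforcement step. It assumes a parser has already turned the query into a tree; the `Matcher`, `VectorSelector`, and `Call` node types are hypothetical stand-ins for a real PromQL AST and cover only function calls and selectors. It is not how Janus is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Matcher:
    label: str
    op: str      # one of "=", "!=", "=~", "!~"
    value: str   # assumed to be already escaped for a quoted string

    def render(self) -> str:
        return f'{self.label}{self.op}"{self.value}"'


@dataclass
class VectorSelector:
    metric: str
    matchers: list[Matcher] = field(default_factory=list)
    window: str = ""  # e.g. "5m" for a range selector, empty otherwise

    def render(self) -> str:
        braces = ""
        if self.matchers:
            braces = "{" + ",".join(m.render() for m in self.matchers) + "}"
        window = f"[{self.window}]" if self.window else ""
        return f"{self.metric}{braces}{window}"


@dataclass
class Call:
    func: str
    args: list

    def render(self) -> str:
        return f"{self.func}({', '.join(a.render() for a in self.args)})"


def enforce(node, required: Matcher):
    """Append the required matcher to every selector in the tree."""
    if isinstance(node, VectorSelector):
        node.matchers.append(required)
    elif isinstance(node, Call):
        for arg in node.args:
            enforce(arg, required)
    return node


query = Call("rate", [VectorSelector("http_requests_total", window="5m")])
enforce(query, Matcher("namespace", "=~", "team-a|team-b"))
print(query.render())
```

The sketch appends the enforced matcher instead of replacing a user-supplied one. All matchers in a PromQL selector must hold at once, so a user who writes `namespace="team-c"` ends up with `{namespace="team-c",namespace=~"team-a|team-b"}`, which simply returns nothing.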
To make sure we don’t break anything on updates, we’ve built extensive end-to-end tests that run against real Prometheus, Loki, and Tempo instances with example data. This way we catch regressions before they ship, not when someone’s dashboard stops working.
What It Doesn’t Do
- Rate limiting: Use your existing ingress or API gateway for that
- Data masking: Janus filters which data you can query, not which fields are visible within results
- Write-path authorization: This is query-side only; ingestion controls are a separate concern
Getting Started
The repository includes a pre-built container image and documentation for the policy syntax.
- Run Janus as a container (Docker, Podman, or Kubernetes)
- Configure your policies to map users/groups to allowed label values
- Point your Grafana datasources at Janus instead of directly at your backends
One requirement: Users need to be authenticated via OAuth2. Janus extracts identity from the access token, so anonymous access won’t work.
Check out the repo for example configs.
Why AGPLv3?
We went with AGPLv3 to make sure improvements stay available to everyone. If you modify Janus and offer it as a service, those modifications need to be shared. We’ve seen too many useful infrastructure tools get absorbed into proprietary platforms and fragment the community.
For most users, running Janus internally to secure their own observability stack, the license has no practical impact beyond ensuring the project stays open.
What’s Next
The initial release covers the core use case: OAuth2 identity, label-based policies, and PromQL/LogQL/TraceQL rewriting. Possible roadmap features include:
- Additional identity sources (mTLS, API keys)
- Configurable audit logging
- Caching layer for policy decisions
Contributions are welcome, whether bug reports, docs, or new features. The GitHub issues are open.
Janus is available now on GitHub. If tenant isolation has been sitting on your backlog, give it a try.
Questions, feedback, or war stories about observability access control? We’d love to hear them.
Why “Janus”? Every observability tool seems to be named after a god: Prometheus, Loki, Thanos. We needed a gatekeeper, so we went with the Roman god of doorways and gates. Plus, he has two faces, which felt appropriate for a proxy that sees both sides of every request.