> Kontext holds secrets server-side and mints short-lived tokens per session.
That probably makes this thing DOA for most people (certainly for me and everyone I know).
We'll think how to best accomodate full self-hosting in the future!
I imagine a lot of the organizations that would find this most valuable, and would be willing to pay a lot, would be the same ones that would require something like this.
It’s usually pretty doable to make a system self-hostable on a happy path. The hard part is supporting it across lots of customer environments without being in the loop every time: custom IdPs, private networking, KMS/HSM/BYOK requirements, upgrade/migration paths, backup/restore, observability, and all the weird edge cases that only show up once other people operate it.
And yes, I think your last point is right: the customers who care most about this category are often exactly the ones who will require self-hosted.
What's your take? Curious what you found effective vs. what you deem hardest from your experience.
What prevents the agent from presisering or leaking the API key - or reading it from the environment?
We need this also for normal usage like development environments. Or when invoking a command on a remote server.
Are you going to add support for services that don't support OIDC or this going to be a known limitation?
Workflow: OneCLI runs as a self-hosted Docker gateway — you route agent traffic through localhost:10255. Kontext doesn't change how you use Claude Code at all, just kontext start --agent claude.
Visibility layer: OneCLI intercepts outbound HTTP requests. Kontext hooks into Claude's PreToolUse/PostToolUse events, so you see bash commands, file ops, and API calls and not just network traffic.
Trust model tradeoff worth naming: OneCLI is fully self-hosted. Kontext holds secrets server-side and mints short-lived tokens per session. We do this via token exchange, RFC 8693, and natively build upon Oauth to support only handing over short-lived tokens - you don't need to capture refresh tokens for external tool calls at all.
So if Claude Code invokes Bash and runs curl ..., we see that tool invocation. If it invokes Bash and runs python script.py, and that script makes HTTP requests internally, we still see the Bash invocation.
Aperture solves “make multiple coding agents talk to the right LLM backend through an Aperture proxy.” We solve “launch a governed agent session with identity, short-lived third-party credentials, and tool-level auditability.” They overlap at the launcher layer, but the security goals are different.
I was actually just about to get started writing this but in Rust....
Nice work