
Why AI Agents Leak Secrets (And How to Stop It)

The 5 most common ways AI agents leak secrets — from .env files on disk to context window exposure — and concrete mitigations for each vector.

ai-security, secret-management, leak-prevention

AI agents are the fastest-growing consumers of API keys, database credentials, and service tokens. They are also the most dangerous holders of those secrets. Unlike traditional server processes, AI agents operate with broad autonomy, generate verbose logs, and maintain conversational context that persists across interactions. This creates five distinct leak vectors that most secret management practices were never designed to handle — and contributes to the growing problem of secret sprawl in AI-native development.

Vector 1: .env Files on Disk

The Mechanism

The most common way developers provide secrets to any process — including AI agents — is through .env files. These files sit in the project directory as plaintext, readable by any process with filesystem access.

Concrete Example

A developer creates .env with OPENAI_API_KEY=sk-proj-abc123... for their AI agent. The agent's code reads it via dotenv. The file is accidentally committed to Git, picked up by a filesystem indexer, or read by a malicious dependency running in the same environment.

Even when .env is in .gitignore, the file remains on disk in plaintext. Any process running under the same user — including the AI agent itself — can read, copy, or exfiltrate it.
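To see why a gitignored .env offers no runtime protection, consider how little code a co-resident process needs in order to read it. This sketch (with an invented key value and a temp directory standing in for the project root) parses the file much as dotenv would:

```python
import os
import tempfile
from pathlib import Path

def read_env_file(path):
    """Parse KEY=value lines from a dotenv-style file, exactly as any
    other process running under the same user could."""
    secrets = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            secrets[key.strip()] = value.strip()
    return secrets

# Simulate a project .env (illustrative key, not a real credential).
workdir = tempfile.mkdtemp()
env_path = os.path.join(workdir, ".env")
Path(env_path).write_text("# agent config\nOPENAI_API_KEY=sk-proj-abc123\n")

# A "malicious dependency" needs nothing more than filesystem access:
leaked = read_env_file(env_path)
```

Nothing here requires elevated privileges or exploits; plaintext on disk is readable by design.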

Mitigation

Replace static files with runtime resolution. KeyZero's .keyzero.toml maps environment variable names to vault references, and kz run resolves them at process start:

[secrets]
OPENAI_API_KEY = { provider = "keychain", ref = "myapp-openai-key" }

kz run -- node agent.js

The secret exists only in the subprocess environment, never on disk as plaintext.
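The pattern behind runtime resolution can be sketched in a few lines: resolve each reference from a backing store at launch, then hand the value to the child process through its environment only. This is a minimal illustration of the pattern, not KeyZero's implementation — the in-memory vault dictionary stands in for a real keychain lookup:

```python
import os
import subprocess
import sys

def resolve_secret(ref):
    """Stand-in for a keychain/vault lookup; a real resolver would call
    the OS keychain or a secrets manager. The ref name is hypothetical."""
    vault = {"myapp-openai-key": "sk-proj-abc123"}  # simulated backend
    return vault[ref]

def run_with_secrets(command, secret_refs):
    """Resolve each secret at process start and expose it only in the
    child's environment -- nothing is ever written to disk."""
    env = dict(os.environ)
    for var, ref in secret_refs.items():
        env[var] = resolve_secret(ref)
    return subprocess.run(command, env=env, capture_output=True, text=True)

# The child sees OPENAI_API_KEY; no plaintext file exists anywhere.
result = run_with_secrets(
    [sys.executable, "-c", "import os; print(os.environ['OPENAI_API_KEY'])"],
    {"OPENAI_API_KEY": "myapp-openai-key"},
)
```

When the subprocess exits, the resolved value disappears with its environment.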

Vector 2: Logs and Traces

The Mechanism

AI agent frameworks produce extensive logs for debugging and observability. These logs capture environment variables, HTTP headers, request/response bodies, and internal state. Secrets injected as environment variables frequently appear in startup logs, crash dumps, and distributed traces.

Concrete Example

A LangChain agent configured with verbose=True logs every tool call, including the full HTTP request with Authorization: Bearer sk-proj-abc123... in the headers. These logs are shipped to Datadog, Splunk, or a plain log file — now the secret is stored in a second system with its own access controls (or lack thereof).

Mitigation

Use blind mode to ensure the agent process never holds the real secret value. With kz run --blind, the agent sees kz_masked_7f3a9b... instead of the real key. Even if every byte of memory is logged, no real credential is exposed.
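The token-swap idea can be illustrated conceptually: the agent composes requests with a masked placeholder, and a trusted edge component rewrites headers on the way out. This is a sketch of the concept, not KeyZero's actual proxy; the mask table and values are invented for illustration:

```python
# Mapping held only by the local proxy, never by the agent (illustrative values).
MASK_TABLE = {"kz_masked_7f3a9b": "sk-proj-abc123"}

def unmask_headers(headers):
    """At the network edge, replace masked tokens with real values.
    The agent process only ever sees the masked form."""
    out = {}
    for name, value in headers.items():
        for masked, real in MASK_TABLE.items():
            value = value.replace(masked, real)
        out[name] = value
    return out

# The agent builds a request holding only the masked token...
agent_headers = {"Authorization": "Bearer kz_masked_7f3a9b"}
# ...and the edge rewrites it just before the bytes leave the machine:
wire_headers = unmask_headers(agent_headers)
```

Everything the agent can log, serialize, or dump contains only the mask.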

Combine this with structured logging that explicitly redacts known secret patterns.
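A redaction filter in the logging pipeline is cheap to add. A minimal Python sketch, assuming OpenAI-style `sk-` keys and bearer tokens as the patterns of interest (extend the list for your own providers):

```python
import io
import logging
import re

# Patterns for known secret shapes (illustrative; extend as needed).
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_\-]{8,}"),  # OpenAI-style keys
    re.compile(r"Bearer\s+\S+"),           # bearer tokens in headers
]

class RedactSecrets(logging.Filter):
    """Logging filter that masks known secret patterns before records are emitted."""
    def filter(self, record):
        msg = record.getMessage()
        for pattern in SECRET_PATTERNS:
            msg = pattern.sub("[REDACTED]", msg)
        record.msg, record.args = msg, ()
        return True

# Wire the filter into a logger; an in-memory buffer stands in for a log sink.
buffer = io.StringIO()
logger = logging.getLogger("agent")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(buffer))
logger.addFilter(RedactSecrets())

logger.info("Authorization: Bearer sk-proj-abc123")
redacted_output = buffer.getvalue()
```

Pattern-based redaction is a backstop, not a guarantee — it only catches shapes you anticipated, which is why blind mode is the primary defense.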

Vector 3: Context Windows

The Mechanism

LLM-based agents maintain a context window — the full conversation history that the model processes on each turn. If a secret value appears anywhere in this context (from a tool response, a user message, or a system prompt), it is sent to the LLM provider's API on every subsequent request.

Concrete Example

An agent resolves a database password and stores it in a variable. On the next turn, the agent's framework serializes the full state — including that variable — into the prompt sent to the LLM API. The secret now exists in the LLM provider's request logs, potentially in training data pipelines, and in any cached responses.
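The leak is easy to reproduce with a toy framework. In this sketch (fake password, deliberately naive serializer), the prompt builder dutifully copies every piece of agent state into the outbound prompt:

```python
import json

# Agent state containing a secret resolved earlier (fake value for illustration).
state = {
    "task": "run nightly report",
    "db_password": "p@ssw0rd-demo",
}

def build_prompt(state, user_message):
    """Naive framework behavior: serialize the full agent state into the prompt."""
    return f"STATE: {json.dumps(state)}\nUSER: {user_message}"

# Every turn re-sends the secret to the LLM provider.
prompt = build_prompt(state, "continue")
```

Real frameworks are subtler, but any code path that flattens agent state into model input has this failure mode.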

Mitigation

Secrets must never enter the context window. KeyZero's MCP fetch tool resolves credentials and makes HTTP requests on behalf of the agent — the agent receives the API response but never sees the raw credential:

{
  "mcpServers": {
    "keyzero": {
      "command": "kz",
      "args": ["server", "start", "--bundle", "./bundle.yaml", "--mcp"]
    }
  }
}

The agent calls fetch with a resource reference. KeyZero resolves the secret, injects it into the outbound request, and returns only the response body.
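The broker pattern behind such a fetch tool can be sketched briefly: the tool resolves the credential server-side, attaches it to the outbound request, and hands back only the body. The vault dictionary, resource name, and helper functions below are illustrative, not KeyZero's internals:

```python
import urllib.request

VAULT = {"stripe/api-key": "sk_live_abc123"}  # simulated secret store

def build_request(resource, url):
    """Resolve the credential broker-side and attach it to the request;
    the calling agent never sees the token."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Bearer {VAULT[resource]}")
    return req

def fetch(resource, url):
    """Make the authenticated call on the agent's behalf, returning only the body."""
    with urllib.request.urlopen(build_request(resource, url)) as resp:
        return resp.read().decode()
```

Because the credential is attached inside the broker, the agent's side of the exchange — the tool call and the returned body — contains no secret material.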

Vector 4: Tool Call Parameters

The Mechanism

AI agents invoke tools — web requests, database queries, file operations — by constructing parameters. When an agent holds a secret, it passes that secret as a tool parameter. These parameters are logged by the agent framework, visible in the LLM's reasoning traces, and often included in error reports.

Concrete Example

An agent constructs a curl-like tool call: http_request(url="https://api.stripe.com/v1/charges", headers={"Authorization": "Bearer sk_live_abc123..."}). The secret appears in the tool call's parameter list, which is part of the conversation, logged by the framework, and sent to the LLM for planning the next step.

Mitigation

Remove secrets from tool call parameters entirely. With KeyZero's fetch MCP tool, the agent specifies a resource reference (resource: "stripe/api-key") and a target URL. KeyZero handles credential injection at the network layer. The agent's tool call contains no secret material:

fetch(resource="stripe/api-key", resolver="stripe", url="https://api.stripe.com/v1/charges")

Vector 5: Error Messages

The Mechanism

When an API call fails, the error response often echoes back the request — including headers, query parameters, and authentication tokens. These error messages are captured by the agent, added to its context, and used for retry logic.

Concrete Example

An agent sends a request with an expired API key. The upstream service returns: {"error": "Invalid API key: sk-proj-abc123...", "status": 401}. The agent logs this error, includes it in its context for debugging, and may even present it to the user. The secret is now leaked through an error path that most developers never audit.

Mitigation

When using KeyZero's blind mode, the request that reaches the upstream API contains the real secret — but any error echoed back is intercepted by the MITM proxy before it reaches the agent. And because the agent itself only ever held the masked token, even if it logs the error, no real credential is exposed.
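One way such interception can work is simple response scrubbing: the proxy knows the real-to-masked mapping, so it rewrites any echoed secret before the agent sees the body. A minimal sketch with invented values — not a statement of KeyZero's actual mechanism:

```python
# Mapping from real values back to masks, held only by the proxy (illustrative).
MASK_TABLE = {"sk-proj-abc123": "kz_masked_7f3a9b"}

def scrub_response(body):
    """Replace any real secret value echoed by the upstream with its mask."""
    for real, masked in MASK_TABLE.items():
        body = body.replace(real, masked)
    return body

# An upstream 401 that echoes the key back, as in the example above:
upstream_error = '{"error": "Invalid API key: sk-proj-abc123", "status": 401}'
safe_error = scrub_response(upstream_error)
```

The agent can log, retry on, or display `safe_error` freely — the error path is no longer a leak path.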

Additionally, KeyZero's connection control in .keyzero.toml restricts which hosts the agent can contact, limiting the attack surface:

[blind]
[[blind.connections]]
allow = true
host = "api.openai.com"

[[blind.connections]]
allow = false
host = "*"
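The allow/deny semantics can be illustrated with a few lines of matching logic. This sketch assumes first-match-wins ordering and glob-style host patterns — an assumption made for illustration, not a statement of KeyZero's actual evaluation rules:

```python
from fnmatch import fnmatch

# Rules mirror the TOML above; assumed semantic: first matching rule wins.
CONNECTION_RULES = [
    {"allow": True,  "host": "api.openai.com"},
    {"allow": False, "host": "*"},
]

def connection_allowed(host):
    """Return True if the first rule matching `host` is an allow rule."""
    for rule in CONNECTION_RULES:
        if fnmatch(host, rule["host"]):
            return rule["allow"]
    return False  # default deny when no rule matches
```

With a trailing deny-all wildcard, a compromised or misbehaving agent cannot exfiltrate data to an arbitrary host even if it obtains something worth exfiltrating.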

The Zero-Trust Principle for AI Agents

These five vectors share a common root cause: the agent holds a secret it does not need to hold. Traditional secret management assumes the consumer is trusted — it fetches a credential and uses it directly. AI agents break this assumption because their internal state is observable, loggable, and transmittable in ways that server processes are not.

The zero-trust principle for AI agents is straightforward: an agent should never hold a secret it does not need. If the agent needs to call an API, it should not hold the API key — it should hold a reference to a system that will make the authenticated call on its behalf.

KeyZero implements this principle through two mechanisms:

  • Blind mode (kz run --blind): The agent process receives masked tokens; a local MITM proxy swaps them for real values at the network edge
  • MCP fetch tool: The agent requests an authenticated HTTP call by resource reference; KeyZero resolves the credential and makes the call, returning only the response

Both approaches ensure that secrets never enter the agent's memory, context window, logs, or tool call parameters. The attack surface for secret leakage drops to zero because there is nothing to leak. For a comprehensive set of deployment strategies that build on these principles, see five patterns for secret-safe AI deployments.