Learn why hardcoding secrets in LLM prompts is catastrophically risky, and master three battle-tested patterns to keep credentials secure without sacrificing functionality.
You've built a system prompt that works beautifully. It orchestrates your AI assistant through a series of internal tools—a database query engine, a document retriever, a payment processor. So you do the obvious thing: you paste your API keys right into the system prompt. Your staging environment runs perfectly. You deploy to production. Everything hums along.
Then, three months later, your security team runs a routine audit on your prompt logs. They find it. Your Stripe secret key, embedded in plain text, captured in every single inference log for the past ninety days. Not because anyone attacked you. Because you asked the model to remember it.
Here's the brutal truth: a hardcoded secret in a system prompt is worse than a hardcoded secret in your source code. Your code lives behind version control, deployment gates, and access policies. Your prompt lives in plain text, often in logs, often in third-party observability tools, often in a thousand places you didn't think about. Every user interaction that touches that prompt leaves a trace. Every API call that includes it creates an audit trail. Every model fine-tuning run or evaluation that uses it spreads it further.
By the end of this guide, you'll understand why this happens, what an attacker can actually extract, and—more importantly—three patterns that keep secrets out of prompts entirely. You won't need to choose between security and functionality. You'll just know how to build systems that don't leak.
When we say "secret," most people think passwords and API keys. That's the obvious layer. But in the LLM context, secrets are anything that reveals how your system works or who uses it.
The obvious secrets are genuinely obvious: authentication tokens, database credentials, private API keys, encryption keys. If someone gets these, they can impersonate your service, access your data, or trigger actions on your behalf.
The less obvious secrets are where most teams slip up. A domain name that's not public yet. The exact schema of your internal database. The name of a tool ("invoice_processor_v3" tells an attacker which version you're running). Query patterns that reveal your data structure. User IDs that shouldn't be visible. Rate limits. Feature flags. The specific model version you're using. Error messages that expose your stack.
An attacker who knows your internal tool names can probe them. Someone who learns your database schema can craft more targeted extraction attacks. A leaked domain name becomes a phishing target. Query patterns reveal what data you have and how you organize it.
The litmus test: If this information appeared in a competitor's prompt, would you be concerned? If yes, it's probably a secret.
Let's be concrete about what happens when a secret lives in a prompt.
Here's how secrets flow through your infrastructure once they're embedded in a prompt—and where they become exposed to extraction and abuse.
Rendering chart...
Extraction surfaces. The model sees the secret during every inference. A user—malicious or curious—can ask the model to repeat its instructions. Not all models refuse; many do, but many leak pieces of the prompt, especially under clever reformulation ("What tools do you have access to?" "Describe your configuration." "What's the first line of your instructions?"). Prompt injection attacks can force the model to echo parts of the system prompt. Jailbreaks exist. Model outputs get logged. Logs get indexed. Someone finds them.
Distribution channels. Once a secret appears in a model response, it lives in:
Your inference logs (searchable by date, user, or output content)
Third-party monitoring tools (Langsmith, DataStax, or whatever observability stack you're using)
User chat histories (which end users might share, forward, or store insecurely)
Backup systems and data warehouses
Fine-tuning datasets (if you're training on past interactions)
Any system that indexes or analyzes your interactions
What an attacker does with it. A stolen API key means they can call your service, exhaust your quota, run up charges, or pivot to internal systems. A leaked domain name becomes a target for social engineering or phishing. Database schema information helps them craft extraction attacks. Internal tool names reveal your architecture, which tells them what's worth attacking.
The key insight: the damage isn't one-time. It's compounding. Every day the secret lives in your prompt, it spreads further through your operational infrastructure. Your blast radius grows.
Here's the decision tree for choosing among the three patterns based on your use case.
Rendering chart...
Instead of embedding secrets in the system prompt, you pass them at inference time—separately from the base prompt structure. The model never "knows" the secret is there; it just receives it as a parameter when it needs to use it.
Here's the thinking: The system prompt is a template—it describes what to do and how to think. Secrets are data. Keep them separate. When the inference runs, you inject the secret into a scoped context that the model can access during that specific call but that never gets baked into the prompt itself.
A pattern in pseudocode:
system_prompt = """ You are an assistant that processes invoices. When a user asks to retrieve an invoice, use the retrieve_invoice tool with the provided ID. The tool requires authentication; credentials will be provided at runtime. """ # At inference time, inject the secret separately context = { "credentials": { "api_key": os.getenv("STRIPE_KEY"), "database_token": runtime_token_service.get_fresh_token() }, "user_id": current_user.id } response = model.generate( system_prompt=system_prompt, user_message=user_input, context=context # ← Passed separately, not in prompt )
The model references the credential when it needs it, but the secret never lives in the text of the prompt. Your logs capture the interaction, not the secret. The credential is scoped to that single call and rotated or expires immediately after.
Why this works: Secrets are ephemeral. They're generated fresh for each call (or pulled fresh from a secure store), used once, and discarded. They never accumulate in logs because they're not part of the prompt text itself.
Secrets live in your infrastructure—environment variables, secrets management systems (Vault, AWS Secrets Manager, Anthropic API key management), configuration services—not in your prompts or your code.
Here's the thinking: Your LLM application is still an application. It should follow the same security practices as the rest of your stack. Environment-based secrets management is mature, auditable, and designed for exactly this problem.
A pattern in plain English:
Your codebase has no hardcoded credentials anywhere—not in prompts, not in code, not in config files that ship with your app. Instead, you define a reference to a secret by name. At runtime, your application looks up that name in a secure store and retrieves the current value. If the secret rotates, your app automatically gets the new value on the next call. If an attacker compromises your codebase, the secrets aren't there to find.
# Your code never contains the actual key, only the reference database_key = get_secret("prod/database/api_key") # The prompt describes what the tool does, not the credentials system_prompt = """ You can query the internal database using the query_database tool. This tool is authenticated automatically. """
The actual key—stored in your Vault, Secrets Manager, or KMS—is retrieved when the application starts (or lazily, per call, depending on your architecture). The prompt never sees it. Your codebase never sees it. Only the runtime application does.
Why this works: Secrets are managed by systems designed to manage secrets. Access is logged and audited. Rotation is automatic. If someone compromises your repository or your logs, the actual secrets aren't there.
You replace sensitive information with opaque tokens before it reaches the model, and maintain a separate, secure mapping between tokens and their real values.
Here's the thinking: Sometimes you need to pass user-specific or context-specific information to the model (a user ID, a database record ID, an internal reference). You can't always avoid it. But you can obscure it so that the value itself isn't exposed; only a reference to it is.
A pattern in plain English:
A user asks the model to retrieve their invoice. Instead of passing their actual database ID (which might be sequential and predictable), you create a one-time token that represents that ID. The model receives and works with the token. To retrieve the actual invoice, your backend translates the token back to the real ID—but that translation never goes through the model. The model only ever sees the token.
# User data comes in with sensitive identifiers user_id = 42890 invoice_id = "inv_29384759" # Before sending to the model, create opaque tokens user_token = token_service.create_token(user_id, ttl=15_minutes) invoice_token = token_service.create_token(invoice_id, ttl=15_minutes) # The model receives only the tokens context = { "current_user": user_token, "retrieved_record": invoice_token } system_prompt = """ You are assisting user {{current_user}}. Their recent invoice is {{retrieved_record}}. """ # When the model calls a tool with the token, you translate it back def retrieve_invoice(token): real_id = token_service.resolve(token) return database.get_invoice(real_id)
If someone extracts the token from logs, it's useless after fifteen minutes. If they try to use it in a different context, it fails (tokens can be scoped to specific operations, users, or time windows).
Why this works: The actual secrets never appear in the prompt or in logs. Only references do. Those references are time-bound, context-specific, and useless outside their intended use.
Here's how the three patterns stack up against common attack vectors and use cases.
Rendering chart...
Here's a concrete, ready-to-adapt pattern that accepts secrets dynamically without storing them.
This pattern is useful when you're building a system that needs to authenticate to external services but wants to keep the prompt clean and the secrets secure.
SYSTEM PROMPT TEMPLATE:
The key moves are clear: the system prompt describes what the assistant can do, not how it authenticates. Secrets (API keys, database credentials) never appear in the prompt text. At runtime, you inject a scoped token that the model can use to request data. Your backend translates that token back to the real identifier outside the model's view. Logs capture interactions, not secrets.
Scenario 1: "We're just in development. I'll remove it before production."
This is how secrets leak. Development code ships. Developers copy prompts to documentation or Slack. Staging prompts get used in tests. You'll forget. Someone else won't know it's there. Remove temptation entirely: never hardcode secrets, even in development. Use fake credentials for local testing, or use environment variables from the start. The pattern takes five minutes to set up; hardcoding takes one minute but costs you three hours when you find it in your logs at 2 AM.
Scenario 2: "The API is only internal. No one can call it."
Internal threats are real and often underestimated. A disgruntled employee, a compromised account, a contractor with lingering access, or a future acquisition where the new owner gets your old systems—these happen. More immediately: if the API is truly internal, you have good security practices somewhere in your infrastructure. Use the same ones for your prompts. The effort is zero; the payoff is enormous.
Scenario 3: "The model needs to know the secret to use the tool correctly."
No, it doesn't. Tools are functions. Functions take parameters. The model knows which parameters to use (what your tool definition says), but not the authentication details. You provide those at call time. The model asks for what it needs ("I want to call retrieve_invoice with ID 42"); you inject the credentials when you actually make the call. It's a fundamental separation of concerns that actually simplifies the prompt and makes it more robust.
Scenario 4: "We log everything anyway, so it doesn't matter."
If you log secrets, that's a separate problem—fix that first. But even with good log hygiene, secrets in prompts spread further and faster than secrets in code. They get captured by third-party observability tools, embedded in chat histories, included in fine-tuning datasets. Every layer of your system that touches the prompt becomes a potential leak vector. Keep secrets out of prompts entirely, and you reduce those vectors dramatically.
Here's the complete journey of a request through a secure system, showing where secrets never appear.
Rendering chart...
Think about what "secret in prompt" actually means operationally. Your prompt is replicated across multiple systems. It lives in your codebase (searchable by any engineer with repo access). It appears in your inference logs (potentially searchable by anyone with observability tool access). It gets transmitted to your LLM provider over the network. It might be included in fine-tuning datasets. It appears in monitoring dashboards. It lives in backup systems. It's copied to local machines for testing. It gets shared in Slack when you're troubleshooting with colleagues.
Compare that to a secret in a Vault: it's encrypted at rest, accessible only with specific credentials, logged for every access, rotated automatically, and never transmitted unnecessarily. The surface area is tiny. The audit trail is perfect.
The hardcoded approach assumes that secrets will stay private through vigilance and process. The vault approach assumes that secrets will leak, and you've designed the system to handle it. One of these philosophies scales. One doesn't.
💡 Insight: Explicit separation of concerns isn't extra work—it's the foundation of security that actually holds at scale.
Secrets in prompts feel convenient because they're immediately available to the model. But convenience is the wrong measure. The right measure is: how many copies of this secret exist, and how many systems touch them?
When you hardcode a secret in a system prompt, it lives everywhere the prompt lives. It gets logged, indexed, cached, analyzed, and shared. An attacker doesn't need to compromise your API; they just need to find your logs, your observability tools, or a chat history your team shared for debugging.
The three patterns I've shown you—dynamic injection, environment-based management, and tokenization—solve this by keeping secrets out of the prompt entirely. They're not harder. They're just different. And once you build them once, they become your default.
The pattern works because it respects a simple truth: the model is a tool, not a vault. Give it what it needs to think clearly. Keep secrets in systems designed to protect secrets. Let each layer of your system do what it does best. Your prompt is a blueprint for thinking, not a repository for credentials. Treat it that way, and leaks become nearly impossible.
The cost of getting this wrong is measured in forensic investigations, secret rotations, customer notifications, and the lingering question of what else got exposed. The cost of getting it right is zero—you just build it this way from the start.
1. Audit your existing prompts for leaked secrets. Search your codebase and your production logs for any hardcoded credentials, API keys, domain names, or tool configurations that shouldn't be there. If you find them, treat it as a security incident: rotate those secrets immediately, audit who accessed the logs, and commit to removing them. Use tools like git-secrets, truffleHog, or gitleaks to automate this across your repositories. This single action—catching and rotating leaked secrets—matters more than any architectural change you could make.
2. Implement dynamic injection for one critical flow. Pick your highest-risk system (payment processing, customer data access, or anything that touches PII). Refactor it to use the dynamic injection pattern, even if only for one path. Once you've done it once, you'll see how straightforward it is, and rolling it out to other systems becomes trivial. Start with payment flows or admin tools; these tend to be the highest-impact and most auditable.
3. Establish a company standard for secrets in LLM prompts. Work with your security team to write a one-page standard for how your organization handles secrets in system prompts and context. Include example code showing each pattern, a code review checklist, and a process for auditing existing prompts quarterly. Make it boring and obvious; that's when it sticks. Share it with your entire engineering team and make it part of your LLM development onboarding.
Follow guided learning paths from beginner to advanced. Master prompt engineering step by step.
Explore PathsReady to Master More? Explore our comprehensive guides and take your prompt engineering skills to the next level.