Who should read this Beginner level guide?

This guide is perfect for Beginner level practitioners looking to improve their prompt engineering skills in prompt engineering, prompting techniques.

How long does it take to complete this guide?

This guide takes approximately 8 min read to read and understand.

What topics does this guide cover?

This guide covers: prompt engineering, prompting techniques.

Back to Guides/Guide

Structured outputs without tears

Learn to make LLMs return schema-matching JSON. Write a minimal JSON Schema, constrain outputs with a system prompt, and auto-validate every response with a repair loop. Includes a hands-on lab to build, test, and confirm valid outputs.

September 4, 2025

8 min read

Promptise Team

Beginner

prompt engineeringprompting techniques

If you’ve ever asked a model for JSON and got back a half-valid blob wrapped in prose, this guide is for you. We’ll make the model produce JSON that actually matches a schema and show how to validate the response automatically. You’ll leave with a reusable system prompt, a tiny schema, and a short validation script you can run.

Structured output means the model returns data in a predictable shape—fields, types, and constraints—not free text. A schema is a compact contract that defines that shape. A validator is a small program that checks whether the JSON follows the contract and tells you exactly what’s wrong if it doesn’t.

Why this matters now: as soon as you rely on LLM output in a workflow—rendering a UI, writing to a database, or triggering another tool—guesswork becomes risk. Schema-checked JSON turns “pretty good text” into “safe, machine-ready data.”

Mental model: “Contract → Constrain → Check”

Think of three actors working together:

Contract (Schema). You define the exact fields and types you want. Strings vs integers. Allowed values. What’s required, what’s optional.

Constrain (Prompt). You tell the model to output only JSON and to follow the schema. You forbid extra prose and explain how to handle uncertainty (e.g., use null).

Check (Validation). You parse the model’s JSON and run a validator. If it fails, you either ask the model to fix it or you reject the result safely.

Example goal: Extract movie metadata.

Contract: title (string), year (integer 1878–2030), genres (array of enum), family_friendly (boolean), content_warnings (array of strings), confidence (number 0–1).
Constrain: “Respond with JSON only, no markdown, no comments. If unknown, use null. Keys in this order.”
Check: Run a validator; if it fails, send the validator’s error back to the model: “Fix to match schema.”

💡 Insight: Models happily comply when you specify how to handle unknowns. “Use null when unsure” prevents hallucinated values and keeps validation green.

Walkthrough: From vague to reliable

Start with the problem statement: “Read a short blurb and return consistent movie metadata.” Without structure, you might try: “Summarize this movie.” You’ll get prose and inconsistent fields.

Now tighten it:

Write the schema. Keep it tiny. Fewer fields mean fewer errors.
Write the system prompt. This is the “policy”: style, format, and strict output rules.
Write the user prompt. This is the “task”: the specific input to fill the schema.
Validate the result. If invalid, show the error and ask for a corrected JSON.

⚠️ Pitfall: Mixing policy and task in one big prompt often leads to drift. Keep the “only JSON” rules in the system prompt and the specific request in the user message.

Practical: Copy-paste scaffolds

Below is a starter system prompt you can reuse for any structured-output task.

Starter system prompt (policy)

json

You are a careful data formatter. Always return ONLY raw JSON that matches the provided JSON Schema exactly. Rules: - Do not include markdown fences, explanations, comments, or trailing text. - Use null for unknown/unsure fields. - Keep keys in the order they appear in the schema. - Do not invent fields or values outside the schema. - Arrays must be JSON arrays, not comma-separated strings. If the instruction conflicts with the schema, the schema wins.

JSON Schema (contract) — movie metadata

json

{ "$schema": "https://json-schema.org/draft/2020-12/schema", "title": "MovieMetadata", "type": "object", "additionalProperties": false, "required": ["title", "year", "genres", "family_friendly", "content_warnings", "confidence"], "properties": { "title": { "type": "string", "minLength": 1 }, "year": { "type": "integer", "minimum": 1878, "maximum": 2030 }, "genres": { "type": "array", "items": { "enum": ["action", "comedy", "drama", "thriller", "scifi", "romance", "family", "animation"] }, "minItems": 1 }, "family_friendly": { "type": "boolean" }, "content_warnings": { "type": "array", "items": { "type": "string" } }, "confidence": { "type": "number", "minimum": 0, "maximum": 1 } } }

User prompt (task) — vague vs precise

Vague:

Read the paragraph and give me the movie info as JSON.

Precise:

Fill the "MovieMetadata" schema for the text below. Remember: JSON only, no markdown or comments. Use null if unsure.

TEXT:
A 1995 family adventure about a boy and his dog crossing the Rockies. Heartwarming, rated PG, with a brief peril scene.

Expected good JSON (shape, not truth)

{
"title": null,
"year": 1995,
"genres": ["family", "drama"],
"family_friendly": true,
"content_warnings": ["peril"],
"confidence": 0.7
}

💡 Insight: When you want the model to permit null, put it in the rules and the schema (no required? you still want the key present with null), or keep it required and instruct the model to set null when unknown.

Automatic validation: minimal scripts

You can validate in any language. Here are two tiny options.

Python (with jsonschema)

json

# pip install jsonschema import json, sys from jsonschema import validate, Draft202012Validator schema = json.loads(open("schema.json").read()) candidate = json.loads(open("model_output.json").read()) v = Draft202012Validator(schema) errors = sorted(v.iter_errors(candidate), key=lambda e: e.path) if errors: print("INVALID") for e in errors: path = ".".join([str(p) for p in e.path]) or "(root)" print(f"- {path}: {e.message}") sys.exit(1) print("VALID")

Node.js (with ajv)

json

// npm i ajv const fs = require("fs"); const Ajv = require("ajv"); const ajv = new Ajv({ allErrors: true, strict: true }); const schema = JSON.parse(fs.readFileSync("schema.json", "utf8")); const data = JSON.parse(fs.readFileSync("model_output.json", "utf8")); const validate = ajv.compile(schema); const valid = validate(data); if (!valid) { console.log("INVALID"); for (const err of validate.errors) console.log(`- ${err.instancePath || "(root)"} ${err.message}`); process.exit(1); } console.log("VALID");

Repair loop (prompt to fix invalid JSON)

json

Your previous JSON did not validate. Here are the errors: {{VALIDATOR_ERRORS}} Return a corrected JSON that satisfies the original schema and rules. JSON only.

Troubleshooting & trade-offs

The model returns prose around JSON. This is almost always a prompt issue. Strengthen the system prompt: “ONLY raw JSON, no markdown fences, no explanations.” If you must allow fences for chat UI readability, strip them before validation.

Types drift (e.g., "year": "1995"). Remind the model that numbers must be numbers and include one tiny exemplar. If drift persists, feed validator errors back verbatim and ask for a corrected JSON.

Enums and booleans are flaky under ambiguity. Prefer small, explicit enums and include the “use null when unsure” rule. Over time, monitor which fields fail and adjust the schema or upstream instructions.

Large outputs hit token limits. In beginners’ setups, start with small schemas and page results (ask for arrays in chunks). For production, consider a structured-output API feature if your provider offers it.

Mini exercise / lab

Your task: extract support ticket triage data from a short message using schema+validation.

Schema

json

{ "$schema": "https://json-schema.org/draft/2020-12/schema", "title": "TicketTriage", "type": "object", "additionalProperties": false, "required": ["summary", "priority", "category", "needs_handoff", "confidence"], "properties": { "summary": { "type": "string", "minLength": 5 }, "priority": { "enum": ["low", "medium", "high", "urgent"] }, "category": { "enum": ["billing", "access", "bug", "question"] }, "needs_handoff": { "type": "boolean" }, "confidence": { "type": "number", "minimum": 0, "maximum": 1 } } }

System prompt

You are a careful data formatter. Return ONLY raw JSON that matches the TicketTriage schema. No prose, no markdown. Use null if unsure.

User prompt

Fill the TicketTriage schema for this message:

“Hi, I can’t log into my account since yesterday. The password reset link says it expired. Please help quickly!”

Expected output (one possible valid answer)

{
"summary": "User cannot log in; password reset link expired.",
"priority": "high",
"category": "access",
"needs_handoff": true,
"confidence": 0.8
}

Now run your validator. If it’s invalid, feed the errors back to the model and request a corrected JSON. Confirm you reach “VALID.”

Summary & Conclusion

You learned the simple flow for reliable structured outputs: write a minimal schema, constrain the model with a strict system prompt, and validate every response. This turns creative language output into dependable data you can ship and automate around.

Common pitfalls—like extra prose, type drift, or enum confusion—usually vanish when you separate policy (system) from task (user) and tell the model exactly how to handle unknowns. When things still go wrong, the validator’s errors are your best debugging tool—show them to the model and ask for a fix.

Structured outputs are not about fancy tricks; they’re about clarity and contracts. Keep schemas small, prompts explicit, and validation non-negotiable. That’s how you get JSON without tears.

Next steps

Swap in your own schema (e.g., product specs or blog metadata) and run the same loop.
Add a repair loop that automatically retries once with validator errors before failing.
Track validation failures over a week and refine your schema or prompts based on patterns.

Learning Paths

Structured Learning

Follow guided learning paths from beginner to advanced. Master prompt engineering step by step.

Explore Paths

Continue Your Learning Journey

Ready to Master More? Explore our comprehensive guides and take your prompt engineering skills to the next level.

Explore More Guides Browse Learning Paths

Structured outputs without tears

September 4, 2025

8 min read

Promptise Team

Beginner

prompt engineeringprompting techniques