Learn consistent AI image generation with JSON prompting. Lock identity and style, vary scenes without drift, and build stable positive and negative prompt sets for reliable, repeatable visuals.
You asked for deeper, better-crafted prompts. This guide upgrades the earlier JSON approach so you can lock a character or style and produce repeatable images across many scenes. We’ll use anchors (details that must stay fixed) and unlocks (details you’re allowed to change). JSON prompting means you describe images as structured fields instead of a chatty paragraph, which reduces drift and makes changes predictable.
Two terms upfront. An anchor is a specific attribute you keep constant (hair shape, a beauty mark, a lens). A delta is a tiny change you apply (pose, action, background) without touching anchors. With a clean JSON schema, you can generate new shots by swapping deltas while the character and style stay the same.
Think in four layers. Your Style Bible defines look and feel; your Character Sheet pins identity; your Shot Spec describes the scene; your Builder converts JSON to the exact text prompt your image model expects. You’ll reuse the first two and vary the third; the builder stays identical.
💡 Insight: Consistency improves when you lock a few powerful knobs (face markers, hair silhouette, palette, lens) and vary only one or two scene variables per shot.
We’ll craft prompts you can paste into a chat model to produce validated JSON, merge character and shot safely, and then emit a clean, ordered text prompt for any image engine.
Start by forcing strictly formatted output. This system prompt keeps responses short, minified, and checkable.
You are an Image Prompt JSON Compiler. Output ONLY minified JSON that matches this schema and rules. No prose. SCHEMA: { "subject": "string", // short name "identity": { // anchors that must not drift "face_markers": "string", // 8–18 words, unique anatomy markers "hair": "string", // silhouette-level description "wardrobe_core": "string", // stable outfit baseline "signature_props": ["string"] // 1–3 stable items }, "style_lock": { "palette": "string", // 3–6 color phrases "rendering": "string", // medium: photo, 3D, ink, painterly "lens": "string", // focal length or lens type "lighting": "string", // primary light + mood "aspect_ratio": "string" // e.g., 4:5, 1:1, 16:9 }, "shot": { "scene": "string", // place + time "pose": "string", // body/head orientation "action": "string", // what the subject is doing "framing": "string", // close-up, waist-up, full-body "background": "string" // environment cues, short }, "negatives": ["string"], // artifacts to avoid "controls": { "seed": "number|string", // if your tool supports it "ref_image_ids": ["string"], // optional references "strengths": { "identity": 1.0, "style": 1.0, "scene": 1.0 } // 0–1 hints }, "self_check": { "contradictions": ["string"], // list any conflicting phrases "drift_risk": "low|medium|high", // based on unlocked fields "confidence_1_5": 1 // how sure the compiler is } } RULES: - Keep values concise and concrete; avoid synonyms. - Minify JSON (no newlines, no comments). - If unsure, include the field with an empty string; still return valid JSON.
This prompt creates a reusable style foundation that keeps shots coherent.
Produce a Style Bible JSON for a warm, modern café aesthetic used for character portraits. Lock a distinctive, repeatable look. Constraints: - palette: name 3–6 colors with short qualifiers (e.g., “coffee brown,” “teal accent”). - rendering: pick one (photographic realism | clean 3D render | ink line art | painterly oil). - lens: pick one focal length with a plain description (e.g., “50mm prime”). - lighting: describe primary light direction and quality (e.g., “soft window key, subtle rim”). - aspect_ratio: choose one, e.g., “4:5”. Return only the "style_lock" object matching the Compiler schema. Minify.
Expected flavor (minified):
{"palette":"coffee browns, cream highlights, teal accents","rendering":"photographic realism","lens":"50mm prime","lighting":"soft window key, gentle rim","aspect_ratio":"4:5"}
Anchor the person with a few unique, stable features. Keep words economical and testable.
Create a Character Sheet JSON for "Barista Lina" who appears the same across images. Requirements: - identity.face_markers: anatomy-level markers (e.g., “almond eyes,” “beauty mark below left eye,” “freckled nose bridge”). 8–18 words. - identity.hair: silhouette-first description (e.g., “short black bob, blunt bangs”). - identity.wardrobe_core: stable outfit baseline (e.g., “teal apron over cream shirt”). - identity.signature_props: 1–2 small, always-on items (e.g., “silver hoop earrings”). Also include negatives for common errors: “extra fingers”, “blurry text”, “double face”. Use controls.seed=142381. For style_lock, reuse this object: {"palette":"coffee browns, cream highlights, teal accents","rendering":"photographic realism","lens":"50mm prime","lighting":"soft window key, gentle rim","aspect_ratio":"4:5"} Return a single minified JSON matching the Compiler schema.
Now vary only the scene, pose, action, framing, background. Let the compiler’s schema guard the rest.
Using the existing Character Sheet, produce a new JSON with this shot: - scene: morning rush café counter - pose: three-quarter, head tilted slightly left - action: pouring latte art into a tulip pattern - framing: waist-up - background: espresso machine, ceramic cups, soft bokeh Do NOT alter identity, style_lock, negatives, or controls. Update only the "shot" object. Return merged minified JSON.
💡 Insight: Treat identity and style_lock as read-only. If you change them, you’re starting a new character or art direction.
Deltas help you make multiple images that still “read” as the same person.
Starting from the last JSON, apply this delta patch: - shot.action: wiping counter with a towel, smiling at camera - shot.background: window with city street reflections - shot.pose: frontal, shoulders squared Keep everything else identical. Return a fully merged JSON. If any required field was dropped, restore it.
When you combine a Character Sheet and Shot Spec, you can ask the model to self-check for contradictions and rate drift risk.
You are a JSON Merger + Drift Guard. Given A=Character Sheet JSON and B=Shot Spec JSON (both follow the Compiler schema), merge them with these rules: - A.identity and A.style_lock override B if there is any conflict. - Only B.shot fields are allowed to differ; all other fields must equal A. - Recompute self_check: - contradictions: list any conflicting phrases you detect (e.g., hair length mismatch). - drift_risk: low if only shot.* changed; medium if style_lock changed; high if identity fields changed. - confidence_1_5: integer. Return a single minified JSON.
Many image tools still take a text string. This builder turns JSON into a canonical prompt order that tends to reduce variance.
You are a Prompt Builder. Input: one JSON matching the Compiler schema. Output: TWO strings: - "positive": a single, comma-separated line in this exact order: subject; identity.face_markers; identity.hair; identity.wardrobe_core; signature_props; shot.scene; shot.pose; shot.action; shot.framing; shot.background; style_lock.rendering; style_lock.lens; style_lock.lighting; style_lock.palette; aspect ratio tag; optional reference tags. - "negative": join negatives with commas. Formatting rules: - Prefer nouns over adjectives. - Keep each clause short (3–6 words) and specific. - Append aspect ratio as “ar:4:5”. - If controls.seed exists, output "seed:142381" as a trailing token. - If controls.ref_image_ids exist, append "ref:ID" tokens. Return minified JSON: {"positive":"...","negative":"..."} only.
Expected flavor (snippet):
{"positive":"Barista Lina, almond eyes, beauty mark left, freckled nose bridge, short black bob, blunt bangs, teal apron over cream shirt, silver hoop earrings, morning rush café counter, three-quarter, head tilted left, pouring latte art tulip, waist-up, espresso machine and cups in bokeh, photographic realism, 50mm prime, soft window key gentle rim, coffee browns cream highlights teal accents, ar:4:5, seed:142381","negative":"extra fingers, blurry text, double face"}
Each template is introduced briefly, then given as a copy-paste block.
Template — One-shot Character + Style authoring (compact)
SYSTEM: (use the JSON Compiler system prompt above) USER: Create a consistent character for "Barista Lina" with a clean photographic style. Fill all fields. Keep text concrete and concise; prefer nouns. Set controls.seed=142381. Return one minified JSON.
Template — Three-shot scene pack from the same anchors
Produce THREE merged JSON objects for the same subject and style. Keep identity/style_lock/negatives/controls identical across all. Vary ONLY shot.* as specified: A) scene: café counter, pose: three-quarter left, action: pouring latte art, framing: waist-up, background: espresso machine bokeh B) scene: sunny window bar, pose: frontal, action: wiping counter with towel, framing: mid-shot, background: street reflections C) scene: outdoor sidewalk table, pose: seated profile right, action: handing takeaway cup, framing: three-quarter, background: soft pedestrians Return an array [..] of minified JSON objects. Include updated self_check for each.
Template — Tighten an existing sheet (repair contradictions)
Given this JSON, fix any contradictions and shorten verbose phrases without changing meaning. Keep anchors stable, reduce drift_risk to "low", and ensure confidence_1_5≥4. Return one minified JSON.
Template — Prompt Builder invocation
Convert this JSON to builder output. Return only {"positive":"...","negative":"..."}.
Template — Negative contract upgrade
Expand negatives with 5 precise artifact blockers relevant to portraits. Keep them general (e.g., "warped hands", "melted ears", "over-sharpening halos"). Return the original JSON with an updated negatives array.
When faces drift, it’s usually because identity fields are vague or overstuffed with style words. Rewrite face_markers with anatomical nouns (“almond eyes,” “beauty mark below left eye”) and remove mood adjectives. If the look drifts, move all look words into style_lock and stop repeating them in the scene. If the engine ignores seed, lean on reference IDs or keep lens, lighting, and palette fixed. If outputs stop feeling fresh, vary pose and background first; only later vary action.
⚠️ Pitfall: Conflicting hair descriptions (“short bob” vs “shoulder-length”) will overpower seeds. Keep one canonical value per anchor.
You will create a small mascot and generate three consistent shots, then build the text prompts.
Task. Define “Pixel the Robot Tutor” with identity anchors and a clean 3D style. Then produce shots A/B/C. Finally, run the Prompt Builder.
Author a Style Bible for clean educational 3D renders (cool gray base, teal accents, tiny orange highlight, 35mm lens, soft studio key + rim, 16:9).
Create the Character Sheet with face markers (“round head, teal LED eyes, orange antenna tip”), a matte plastic finish, and a simple torso panel.
Generate three shots:
A: classroom whiteboard, pointing at diagram, waist-up.
B: library desk, reading a book, three-quarter.
C: outdoor campus bench, waving, mid-shot.
Use the Prompt Builder to emit positive/negative strings for each.
Expected builder output (snippet for A):
{"positive":"Pixel the Robot Tutor, round head teal LED eyes orange antenna tip, matte plastic finish, simple torso panel, classroom whiteboard, three-quarter right, pointing at diagram, waist-up, faint diagrams in background, clean 3D render, 35mm, soft studio key with rim, cool gray with teal accents orange highlight, ar:16:9, seed:884211","negative":"text artifacts, logos, duplicate face, warped hands, melted ears"}
You now have a set of upgraded prompts that make image generation consistent by design. The Style Bible defines the look, the Character Sheet anchors identity, the Shot Spec varies scenes without touching anchors, and the Prompt Builder turns JSON into a stable, engine-friendly string. The self-check fields help you spot contradictions early and keep drift low.
The trade-off is flexibility. Heavy locks produce sameness; light locks invite drift. Start tight—face markers, hair silhouette, palette, lens, lighting—then relax one variable at a time. Keep language short and concrete. When in doubt, move descriptive style words into style_lock and keep shot purely about composition and action.
Next steps
Save your best Style Bible and Character Sheet as reusable files; generate a five-shot pack by changing only background each time.
Add a quick “repair” pass that lowers drift_risk to low before you build prompts.
Wrap the Prompt Builder in a tiny script so you can click-generate consistent positive/negative strings for any JSON you author.
Follow guided learning paths from beginner to advanced. Master prompt engineering step by step.
Explore PathsReady to Master More? Explore our comprehensive guides and take your prompt engineering skills to the next level.