How to write AI image prompts

A good AI image prompt names a clear subject, then layers on the style or medium, the composition, the lighting, the color and mood, and the details that make it specific — plus an aspect ratio and a negative prompt for what to leave out. The more deliberate each layer, the closer the result is to what you pictured.

The anatomy of an image prompt

Eight layers that turn a vague idea into a precise picture.

Subject

What the image is of

Name the main subject concretely — who or what, doing what, where. This anchors everything else.

Style / medium

How it should look

Photograph, oil painting, 3D render, watercolor, line art — and any artist or era reference that fits.

Composition / shot

Framing and angle

Close-up, wide shot, overhead, rule of thirds, portrait or landscape orientation.

Lighting

The light source and quality

Golden hour, soft studio light, harsh noon sun, neon glow, rim lighting — lighting sets the whole mood.

Color / mood

Palette and feeling

Warm earth tones, muted pastels, high-contrast monochrome; calm, dramatic, nostalgic.

Details

The specifics that sell it

Textures, materials, background elements, depth of field, lens (e.g. 85mm), camera, render quality.

Aspect ratio

The output shape

Square, 16:9, 9:16, 3:2 — match it to where the image will be used (banner, story, print).
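Some tools ask for pixel dimensions rather than a ratio. As a rough sketch, the helper below turns a ratio string such as 16:9 into a width and height. The function name `dimensions_for`, the 1024-pixel long edge, and rounding the short edge to a multiple of 8 are illustrative assumptions, not requirements of any particular model.

```python
def dimensions_for(ratio: str, long_edge: int = 1024, multiple: int = 8) -> tuple[int, int]:
    """Turn a ratio like '16:9' into (width, height) in pixels.

    Hypothetical helper: the long edge is fixed at `long_edge`, and the
    short edge is rounded to the nearest multiple of `multiple`.
    """
    w_part, h_part = (int(p) for p in ratio.split(":"))
    if w_part <= 0 or h_part <= 0:
        raise ValueError(f"ratio parts must be positive: {ratio!r}")

    def snap(value: float) -> int:
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    if w_part >= h_part:
        # Landscape or square: width is the long edge.
        return long_edge, snap(long_edge * h_part / w_part)
    # Portrait: height is the long edge.
    return snap(long_edge * w_part / h_part), long_edge
```

For a banner you might call `dimensions_for("16:9")`, and for a story `dimensions_for("9:16")`; check the size limits of the tool you are using before relying on the result.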

Negative prompt

What to exclude

List things to keep out — extra fingers, text, watermark, blur — so the model steers away from them.
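The eight layers above can be treated as fields of a small record that is joined into one prompt string. This is a minimal sketch: the `ImageBrief` class, its field names, and the output wording are hypothetical, chosen only to mirror the layers in this section.

```python
from dataclasses import dataclass, field


@dataclass
class ImageBrief:
    """Hypothetical container for the eight prompt layers."""
    subject: str
    style: str = ""
    composition: str = ""
    lighting: str = ""
    color_mood: str = ""
    details: str = ""
    aspect_ratio: str = ""
    negative: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        # Keep the descriptive layers in a fixed order and skip empty ones.
        layers = [self.subject, self.style, self.composition,
                  self.lighting, self.color_mood, self.details]
        text = ", ".join(layer for layer in layers if layer)
        if self.aspect_ratio:
            text += f", {self.aspect_ratio} aspect ratio"
        if self.negative:
            text += ". Negative: " + ", ".join(self.negative)
        return text


brief = ImageBrief(
    subject="A golden retriever puppy sitting in tall summer grass",
    style="photograph",
    composition="shot from a low angle",
    lighting="warm golden-hour backlight",
    details="85mm lens, shallow depth of field, soft bokeh",
    aspect_ratio="3:2",
    negative=["text", "watermark", "blur"],
)
print(brief.to_prompt())
```

Leaving a field empty simply drops that layer, which makes it easy to see how much a prompt loses when a layer is skipped.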

How the models differ

The same idea needs a different format depending on where you run it.

Midjourney

Comma-separated descriptors plus flags. Add --ar 16:9 for aspect ratio, --v 6 for the model version, and --no followed by what to exclude (for example, --no text). It handles stacks of short, punchy phrases well.
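As an illustration of that format, the sketch below joins descriptors with commas and appends the three flags mentioned above. The function `midjourney_prompt` and its parameters are hypothetical, and listing several --no terms separated by commas is an assumption worth checking against the current Midjourney documentation.

```python
from typing import Iterable, Optional


def midjourney_prompt(
    descriptors: Iterable[str],
    aspect_ratio: Optional[str] = None,
    version: Optional[str] = None,
    exclude: Optional[Iterable[str]] = None,
) -> str:
    """Join descriptors with commas, then append --ar, --v and --no flags."""
    parts = [", ".join(d.strip() for d in descriptors if d.strip())]
    if aspect_ratio:
        parts.append(f"--ar {aspect_ratio}")
    if version:
        parts.append(f"--v {version}")
    if exclude:
        parts.append("--no " + ", ".join(exclude))
    return " ".join(parts)


print(midjourney_prompt(
    ["golden retriever puppy", "tall summer grass", "golden-hour backlight"],
    aspect_ratio="16:9",
    version="6",
    exclude=["text"],
))
```

Flags are placed after the descriptive text so the description stays readable on its own.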

DALL·E 3

Wants a full, natural-language description of the scene: write a sentence or two as if briefing an illustrator. It ignores Midjourney-style flags, so describe the aspect ratio and exclusions in words.
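One way to picture "in words" is a helper that turns a ratio and an exclusion list into plain sentences. Everything here is an assumption made for illustration: the function `describe_scene`, the `ORIENTATION_WORDS` mapping, and the exact phrasing are not taken from any DALL·E documentation.

```python
from typing import Sequence

# Hypothetical mapping from ratio strings to plain-language shapes.
ORIENTATION_WORDS = {
    "1:1": "a square image",
    "16:9": "a wide landscape image",
    "9:16": "a tall portrait image",
    "3:2": "a landscape image",
}


def _join_with_or(items: Sequence[str]) -> str:
    # "a", "a or b", "a, b, or c"
    items = list(items)
    if len(items) <= 2:
        return " or ".join(items)
    return ", ".join(items[:-1]) + f", or {items[-1]}"


def describe_scene(scene: str, aspect_ratio: str = "", exclude: Sequence[str] = ()) -> str:
    """Write the scene, the shape, and the exclusions as full sentences."""
    sentences = [scene.rstrip(". ") + "."]
    if aspect_ratio:
        shape = ORIENTATION_WORDS.get(
            aspect_ratio, f"an image with a {aspect_ratio} aspect ratio"
        )
        sentences.append(f"Compose it as {shape}.")
    if exclude:
        sentences.append(f"Do not include {_join_with_or(exclude)}.")
    return " ".join(sentences)


print(describe_scene(
    "A golden retriever puppy sits in tall summer grass, backlit by warm golden-hour light",
    aspect_ratio="3:2",
    exclude=["text", "watermarks", "blur"],
))
```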

Stable Diffusion

Tag-style prompts (keywords and quality boosters) with a separate Negative prompt field for what to avoid. Weighting and exact phrasing matter; community tags are common.
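The two-field layout can be sketched as a function that returns a prompt and a negative prompt as separate strings. The helper `sd_prompt_pair` is hypothetical, and the `(tag:1.2)` weighting syntax is the convention used by some popular Stable Diffusion front ends; treat it as an assumption and confirm what your own tool accepts.

```python
from typing import Dict, Optional, Sequence, Tuple


def sd_prompt_pair(
    tags: Sequence[str],
    negative: Sequence[str],
    weights: Optional[Dict[str, float]] = None,
) -> Tuple[str, str]:
    """Return (prompt, negative_prompt) strings for the two input fields."""
    weights = weights or {}

    def fmt(tag: str) -> str:
        # Wrap a tag as (tag:weight) only when a weight was given for it.
        weight = weights.get(tag)
        return f"({tag}:{weight})" if weight is not None else tag

    return ", ".join(fmt(t) for t in tags), ", ".join(negative)


prompt, negative_prompt = sd_prompt_pair(
    tags=["golden retriever puppy", "tall summer grass", "soft bokeh"],
    negative=["text", "watermark", "blur"],
    weights={"soft bokeh": 1.2},
)
print(prompt)
print(negative_prompt)
```

Keeping the negative terms in their own string mirrors the separate field, so they never leak into the positive prompt.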

Flux

Prefers detailed, natural-language prompts and handles long, descriptive sentences well. Strong with coherent scenes and realistic detail — describe rather than tag.

Gemini

Conversational, natural-language descriptions work best, and you can refine across turns. Describe the scene fully; exclusions go in plain words rather than flags.

Ideogram

Known for rendering legible text inside images. Put the exact words you want shown in quotes and describe the layout. It suits logos, posters, and signage.

Before and after

Weak prompt

“a dog”

No subject detail, style, lighting, composition, or aspect ratio → a random, generic image.

Fully specified prompt

“A golden retriever puppy sitting in tall summer grass, photograph, 85mm lens, shallow depth of field, warm golden-hour backlight, soft bokeh, shot from a low angle, 3:2 aspect ratio. Negative: text, watermark, blur.”

Subject + style + lighting + composition + aspect ratio + exclusions → a focused, intentional image.

Let Promptivo build the image prompt for you

Promptivo has a guided image-prompt builder: describe what you want once, and it assembles every layer — then formats it for the model you’re using, whether that’s Midjourney, DALL·E, or Stable Diffusion.

Want image-only? Try the free image prompt generator.

Questions, answered

What makes a good AI image prompt?

A good image prompt names the subject, then layers on style or medium, composition, lighting, color and mood, fine details, and aspect ratio — plus a negative prompt for what to exclude. The more specific each layer, the closer the result matches what you pictured.

What is a negative prompt?

A negative prompt lists things you do not want in the image — extra fingers, text, watermarks, blur, clutter. Stable Diffusion has a dedicated negative field; Midjourney uses the --no flag; DALL·E, Flux, and Gemini take exclusions in plain language.

How is a Midjourney prompt different from a DALL·E prompt?

Midjourney rewards comma-separated descriptors and flags such as --ar 16:9, --v 6, and --no. DALL·E 3 ignores flags and wants a full natural-language description of the scene, with aspect ratio and exclusions written out as words.

Do I need different prompts for each image model?

The same core ideas (subject, style, lighting, composition) carry across models, but the format differs: comma-separated descriptors and flags for Midjourney, tags with a separate negative prompt for Stable Diffusion, and flowing description for DALL·E, Flux, and Gemini. Promptivo adapts one brief into the right format for each.