
Generating an image is easy; using it safely in a paid campaign is the hard part. This episode teaches the three creative jobs, how to prompt image models, and the legal and human-review steps that keep AI art off your brand's liability list.
This is an Act I tutorial on AI image and creative generation for marketing: the single asset, end to end, from brief to export. We split the work into three jobs and match a tool category to each, then spend most of the time on the part that actually bites you, commercial safety.
The three jobs. Fast on-brand graphics live in hosted design tools like Canva Magic Studio and Adobe Express. Original hero art comes from image models: Midjourney (Version 8.1), OpenAI's GPT Image 1.5, Adobe Firefly, Ideogram, Google Imagen 3, and Flux from Black Forest Labs. Ad creative with legible baked-in text is its own job, best done by generating a clean hero then adding type in Photoshop or Canva.
Prompting. Use one structure: subject, medium, style, lighting, framing, mood, palette. Choose aspect ratio before you generate, and leave negative space for copy. Reference images and tools like Firefly Custom Models or Midjourney's character reference handle brand and character consistency.
Commercial safety, the heart of it. Pure AI images aren't copyrightable; Thaler v. Perlmutter is now settled law. Firefly is the only major tool with transparent training plus indemnity. Watch trademark and celebrity likeness, follow FTC disclosure plus New York's new synthetic-performer law, and understand C2PA content credentials and the EU AI Act. Lawsuits to track: Getty v. Stability AI and Andersen v. Stability AI (trial September 2026).
The slop trap. Visual tells read as cheap. Fix them with a ten-to-thirty-minute human-in-the-loop edit pass, the same discipline we teach for copy.
Plus the copyable workflow from brief to export sizes, and a light cost touch. News up top: Cannes Lions, Canva AI 2.0, Google AI Max, and the OpenAI Partner Network.
A quick tour of the week in AI and marketing, June twentieth through twenty-fourth, twenty twenty-six.
The big one is Cannes Lions, the Festival of Creativity, which opened June twenty-second and runs through the twenty-sixth. Artificial intelligence is the dominant theme, but the message from the jury leaders is a reset. Using AI on its own is no longer impressive. They're judging by measurable outcomes and authenticity instead, and they've added a new AI Craft subcategory and a Creative Brand Lion award. The takeaway for you is simple. Novelty is now table stakes. What wins is the quality of the creative, the attribution back to results, and whether the work feels authentically like your brand. There's a session today at eleven thirty called Winning the AI Discovery Era, featuring chief marketing officers from Google and JPMorganChase alongside OpenAI's chief revenue officer. Worth catching the recording.
Next, Canva shipped Canva AI two point zero this month. It turns the platform into a conversational creative workspace, where you describe what you want and it builds editable designs, not flat images you can't touch. It adds connectors, scheduling, web research, and brand intelligence, all powered by a new model they call the Canva Design Model. Canva reports that ninety-seven percent of marketing leaders already use AI in daily creative work, so the pitch is fewer hand-offs between tools. Test it on one campaign and measure the time you actually save.
On the advertising side, Google is rolling out asset experiments for Performance Max, so you can compare creative variants and measure video impact directly. More important for planning, dynamic search ads will auto-upgrade to Google's AI Max for Search starting February twenty twenty-seven. Traditional dynamic search is sunsetting. Start shifting those budgets to Performance Max now and run the asset experiments before the cutoff.
And one structural shift. OpenAI launched its Partner Network around June nineteenth, backed by a hundred and fifty million dollars, with a goal of certifying three hundred thousand consultants by the end of the year. Read that as OpenAI moving from competing on model power to competing on implementation. If you use an agency or consultant, it's worth checking whether they sit in OpenAI's partner tiers.
A standing note on AI search visibility. AI Overviews now reportedly trigger on roughly forty-eight percent of queries, and when your brand gets cited, organic click-through is reportedly thirty-five percent higher. But visibility is volatile. Only about thirty percent of brands stay visible from one answer to the next. Treat answer-engine presence as something you monitor, not something you set and forget.
Today we're doing images. The single asset, start to finish, the way we did it for copy. By the end you'll have a brief-to-export workflow you can run on your own brand this afternoon, and you'll know which steps protect you legally and which ones just make the picture look good. Generating an image is the easy part now. Using it safely in a paid campaign is the part that actually takes craft.
Let me start by splitting the work into three distinct jobs, because the single biggest mistake here is reaching for the wrong category of tool. Each job has its own tool family, and trying to force one tool to do all three is how you end up frustrated.
The first job is fast, on-brand graphics. This is your social card, your email header, your quick promo layout, the stuff you make a lot of and need to stay consistent. The tool family here is hosted design platforms. Canva Magic Studio is the obvious example, around fifteen dollars a month, and it bundles five things that work together: template generation, text to image, copywriting, a spreadsheet input for data, and chart generation. Canva claims users make five and a half times more branded content per week with it. You should treat vendor numbers like that as marketing, but the workflow is real. You write a brief, it generates layouts that match your brand theme. The standout feature is the one that reformats a design across platforms. You build a card once, and it redesigns the layout for Instagram, Stories, TikTok, LinkedIn, and email, redrawing the composition for each aspect ratio rather than just cropping it badly. Adobe Express is the other player in this category, free with premium tiers, with template generation, brand-aware generative fill, a rewrite-text feature for tone, and translation into forty-six languages.
The thing that makes hosted design tools genuinely useful for a marketer who owns a brand is the brand kit. You store your logos, fonts, colors, and approved templates in one place. Then the admin controls can restrict designs to your approved colors and fonts, and even require approval before anything publishes. In twenty twenty-six Canva extended this into a full brand system with a brand voice profile. That ties directly to the brand voice work we did in an earlier episode. The style guide and few-shot examples you built for copy have a visual cousin here, and the platform can enforce it for you. There are also small editing helpers worth knowing by name: a feature that turns part of an image into an editable element you can move and resize, a one-click background remover, a background generator, and an eraser that removes an object and heals the gap behind it. Canva's editing runs on Google's image-editing model under the hood, the one nicknamed Nano Banana, but you don't need to think about that. You just click.
Before we leave hosted design tools, one more reason they matter for the marketer who can't code. You never touch a setting that scares you. Everything happens in a familiar drag-and-drop canvas, the brand kit does the heavy lifting on consistency, and the approval controls mean a junior teammate can produce on-brand work without going off the rails. That's the whole point of this category. It trades the ceiling of a raw image model for guardrails and speed. When the job is volume and consistency, that's the right trade.
The second job is hero art. Original imagery you can't get from a template: the striking lead image for a blog post, a campaign visual, a product shot you don't have the budget to photograph. This is where image models earn their keep, and there are several worth knowing, each with a personality. And to be clear about what an image model is, it's a tool you give a written description to and it paints a brand-new picture from scratch, not a template you fill in. You'll usually access it through a web app or a chat box, no code, no installation.
Midjourney is the artistic one. The current default is Version eight point one, which became the default June tenth this year. Here's a quirk worth knowing: roughly sixty percent of paid users still force the older Version six point one for photorealistic portraits, because they trust its skin texture more. In blind tests, Version six was identified as a real photo forty-one percent of the time. So even within one tool, the newest version isn't automatically the right one for faces. Midjourney's pricing runs from ten dollars a month at the basic tier up to a hundred and twenty at the mega tier. One critical thing: the free tier gives you zero commercial rights. Paid plans give you a full commercial license, and if your organization makes over one million dollars in revenue you need the pro or mega tier. I'll come back to why that license matters, because a license is not the same as owning the copyright.
OpenAI's image generation used to be DALL-E three, but that was deprecated May twelfth this year and replaced by GPT Image one point five. The new model follows instructions better, edits more consistently, and is better at preserving a logo, a face, or a composition through edits. Commercial use is permitted on all tiers, but note this: on the free and Plus tiers, OpenAI may use your inputs and outputs to train its models, while the business, team, and API tiers don't. Pricing runs roughly four to twelve cents an image. And mark your calendar, because the current models are scheduled to deprecate December first this year and roll into GPT Image two.
Adobe Firefly is the trustworthy one, and this is the one your legal team will like. Firefly is trained on licensed Adobe Stock content, openly licensed content, and public-domain work. It is not trained on scraped web images. Because of that, every paid Creative Cloud plan includes intellectual-property indemnity defense, meaning Adobe will stand behind you if someone challenges an image. It's the only major tool that gives you both transparent training data and indemnification. It's native inside Photoshop, Illustrator, Premiere, and Express. And it has a feature called Custom Models that came out of beta this March. You train it on ten to thirty of your own reference images, it costs five hundred Firefly credits and takes about two hours, and afterward every image it makes inherits that consistent style, character, or photographic look. The model stays private in your workspace. That is your visual brand voice, locked in at the model level.
Ideogram is the text one. Ideogram four point zero released June third this year. Its claim to fame is text accuracy, around ninety to ninety-five percent, which for an image model is remarkable. Most models butcher words. Ideogram gets them right. On its web platform, paid plans give you commercial rights, but the free plan is public, meaning anyone can see and reuse what you make. There's also an open-weight release of four point zero, and that one is non-commercial unless you go through the API or an enterprise license.
Google's Imagen three is the underrated, cheap one. It's available through Google's Gemini interface, it's extremely photorealistic, and at about three cents an image it's the cheapest of the majors. Every output carries an invisible watermark Google calls SynthID, machine-readable only. Commercial use is permitted, though copyright is unclear without substantial human input, which is a theme you'll hear again today.
And Flux, from a company called Black Forest Labs, is the photorealism leader right now. Flux two pro produces skin texture, lighting, and realistic humans that are genuinely hard to tell from a photograph, which makes it excellent for product photography, stock replacement, and mockups. There's a proprietary version available only through an API, a source-available version that's non-commercial on its own, and a fully open-source version.
If you want the quick ranking for twenty twenty-six: for raw photorealism, Flux two leads, then Midjourney, then GPT Image one point five, then Firefly. For artistic and aesthetic control, Midjourney leads. For following your prompt accurately, GPT Image one point five and Midjourney are top. For rendering readable text, Ideogram is the clear winner. And for brand consistency, it's Firefly Custom Models and Midjourney's character reference. Don't memorize that. Test two or three on your own brand and feel the difference.
The third job is ad creative with text baked into the image. This is the hardest one, and I want to set your expectations honestly. Image models struggle with two things: hands and fingers, and dense in-image text like infographics, posters, and interface mockups. Ideogram is the exception on text, nearly reliable at ninety to ninety-five percent for short marketing copy. Midjourney can do a readable single or double line but isn't production-ready for dense text. So the professional workflow is this: generate a clean hero image with whichever model fits, then export it, bring it into Photoshop or Canva, and add your copy with the actual type tool so the text is guaranteed legible and on-brand. Or, if it's just one to three lines, let Ideogram bake it in. The rule is: add dense brand copy after the hero is generated, and never expect the model to typeset for you.
Now, prompting. There's one structure that works across every image model, and once you internalize it you'll never stare at a blank prompt box again. Seven parts: subject, medium, style, lighting, framing, mood, palette. The subject is the main thing, a product, a person, a scene. The medium is photography, illustration, a three-D render, concept art. The style is film, digital, watercolor, photorealistic, minimalist. Lighting could be soft rim light, golden hour, or a studio key and fill. Framing is your composition, a fifty-millimeter close-up, a wide establishing shot, an overhead flat lay. Mood is warm, cool, energetic, corporate, playful. And palette is your colors, named or given as a specific value. So a full prompt sounds like: portrait of a barista, film photo, soft rim light, fifty-millimeter close-up, warm mood, teal and orange palette. That's it. Every clause is a dial you can turn.
Let me show you the same structure doing different work, so it sticks. Say you need a hero for a software company. You might write: a software dashboard on a laptop, three-D render, clean minimalist style, soft studio lighting, overhead flat lay framing, calm and professional mood, cool blue and white palette. Or for a food brand: a stack of pancakes on a plate, photorealistic photography, warm golden-hour light, fifty-millimeter close-up, cozy and appetizing mood, warm amber palette. Notice that the seven slots never change. Only the contents do. That's why this structure is worth memorizing. It turns a vague wish into a set of specific, adjustable instructions, and specificity is exactly what gets you out of generic, averaged-looking output.
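If you or a teammate keep prompts in a script or a spreadsheet export, the seven slots can be sketched in Python like this. The class name, field names, and `render` method are made up for illustration; no image tool requires this shape, and the output is just the comma-separated prompt you would paste into the prompt box.

```python
from dataclasses import dataclass, astuple


@dataclass
class ImagePrompt:
    """The seven slots from the episode, in the order they are spoken."""
    subject: str
    medium: str
    style: str
    lighting: str
    framing: str
    mood: str
    palette: str

    def render(self) -> str:
        # Join the slots with commas, skipping any slot left blank.
        return ", ".join(part.strip() for part in astuple(self) if part.strip())


# The software-company hero from the example above.
software_hero = ImagePrompt(
    subject="a software dashboard on a laptop",
    medium="3D render",
    style="clean minimalist style",
    lighting="soft studio lighting",
    framing="overhead flat lay framing",
    mood="calm and professional mood",
    palette="cool blue and white palette",
)
print(software_hero.render())
```

Keeping each slot as its own field is the point: to turn one dial, you change one field and render again, and the other six stay exactly as they were.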
Pick your aspect ratio before you generate, not after, because changing it later means regenerating. A one-to-one square is your universal fallback, compatible with over eighty percent of placements. A four-by-five portrait crop leads the Meta and Instagram feed and, at the same width, gives you twenty-five percent more vertical screen than a square. Sixteen-by-nine landscape is for desktop, presentations, and blog heroes. And nine-by-sixteen vertical is for Stories and Reels, which matters because over seventy percent of Meta impressions are mobile. If you're going to overlay copy, design for it in the prompt. Ask for a clean off-white textured background with generous negative space, and put the subject in the left or right third so you've got room for the words. A nice trick for a clean corporate look is to prompt for a Swiss grid, off-white background, black typography, one red accent, perfect legibility.
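For the arithmetic behind those ratios, here is a small sketch that turns a ratio and a width into pixel dimensions. The ratios and their uses come from the episode; the 1080 and 1920 pixel widths are assumptions, chosen only because they are common export widths, and the function name is hypothetical.

```python
def height_for(width_px: int, ratio: str) -> int:
    """Return the pixel height for a given width and a 'W:H' aspect ratio."""
    w, h = (int(n) for n in ratio.split(":"))
    return round(width_px * h / w)


# Ratio and use are from the episode; each width is an assumed export width.
PLACEMENTS = {
    "universal square": ("1:1", 1080),
    "Meta and Instagram feed": ("4:5", 1080),
    "desktop, decks, blog hero": ("16:9", 1920),
    "Stories and Reels": ("9:16", 1080),
}

for use, (ratio, width) in PLACEMENTS.items():
    print(f"{use}: {ratio} -> {width} x {height_for(width, ratio)} px")
```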
Then there are negative prompts, which are just instructions for what you don't want, and they're your cheapest insurance against ugly output. Tell the model: no watermark, no extra fingers, no text artifacts, no blurry text, no oversaturation. For skin and quality: no plastic skin, no waxy appearance, no mangled hands, no floating elements. And for brand safety: no logos, no real brand names, no trademarked imagery, no recognizable faces. That last group keeps you out of legal trouble, and we'll get there.
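If you reuse the same exclusions on every job, it can help to keep the three groups as named lists and combine only the ones you need. This is a minimal sketch with hypothetical names. How a negative prompt is actually passed differs from tool to tool, so treat the output as text to paste, not as any tool's official syntax.

```python
# The three groups named in the episode.
NEGATIVES = {
    "artifacts": ["watermark", "extra fingers", "text artifacts",
                  "blurry text", "oversaturation"],
    "skin_and_quality": ["plastic skin", "waxy appearance",
                         "mangled hands", "floating elements"],
    "brand_safety": ["logos", "real brand names",
                     "trademarked imagery", "recognizable faces"],
}


def negative_prompt(*groups: str) -> str:
    """Combine the chosen groups into one comma-separated exclusion list."""
    terms = []
    for group in groups:
        for term in NEGATIVES[group]:
            if term not in terms:  # keep the order, drop repeats
                terms.append(term)
    return ", ".join(terms)


print(negative_prompt("artifacts", "brand_safety"))
```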
For consistency, you've got a few levers. Midjourney lets you pass a reference image with an image weight to influence style and composition, and it has a character reference feature with a character weight from zero to a hundred, where a hundred holds the face, hair, and clothing tightly and zero focuses on the face only. It works best with one subject per reference, and you should avoid real people because it gets inconsistent. Firefly's Custom Models, as I said, learn from ten to thirty of your images and bake your identity into every output. Canva and Adobe pull your brand colors from the brand kit automatically. And one term to demystify: a seed just sets the initial random layout. It does not bookmark a style or a character across prompts, so don't expect reusing a seed to give you the same look.
For editing and iteration, two words to know. Inpainting means you mask a region and the model fills just that area, either contextually to remove an object and heal the gap, or with a text prompt to regenerate exactly that spot. Outpainting, sometimes called generative expand, means extending the canvas beyond the original edges and letting the model fill the new space, which is how you add negative space or convert one aspect ratio into another. Photoshop's Generative Fill and Generative Expand, Firefly, and Canva's Magic Edit all do these. Most tools hand you four options per prompt. The mental model for refining is: coarse, then refine, then detail, then polish, then a brand check at the end.
Let's talk capabilities and limits plainly, because knowing where the models break saves you time. On text rendering, Ideogram four point zero is best at ninety to ninety-five percent, the old DALL-E three was around eighty, Midjourney, GPT Image, and Flux sit around seventy to seventy-five, and Firefly is weakest at about sixty. On photorealism, Flux two pro is effectively indistinguishable from photography, with Midjourney eight point one and GPT Image close behind, while Firefly is cleaner and safer but lags on outdoor and human scenes. On resolution, most of these default to around a thousand pixels square and let you set aspect ratios, and a few have no native upscaling, so you'd use a separate upscaling tool to enlarge. You don't need to track every number. Just know that hero models top out around fifteen hundred pixels and you upscale from there for print.
Now the heart of the whole episode: commercial safety and the law. This is the part that separates a marketer who uses AI images casually from one who can put them in a paid campaign and sleep at night.
Start with copyright, because it's counterintuitive. The US Copyright Office, in guidance from twenty twenty-five and affirmed in twenty twenty-six, says purely AI-generated images are not copyrightable. A prompt by itself is treated as an unprotectable idea or instruction. You only get copyright protection when a human contributes substantial authorship on top of the AI output. And this is settled law now, not a theory. In the Thaler versus Perlmutter case, the Supreme Court declined to hear the appeal in March this year, leaving in place the ruling that a machine can't be an author. So the practical consequence is real: if you generate an image and ship it untouched, you may not be able to stop a competitor from using the same image. Your protection comes from your human edits and your overall design. That alone is a reason to do the edit pass I'll describe.
Next, the license terms, which vary a lot by tool and are separate from copyright. Firefly is the most secure, with licensed and public-domain training data and intellectual-property indemnity on every paid Creative Cloud plan, with enterprise caps in the tens of thousands of dollars. Midjourney's free trial gives zero commercial rights and Midjourney actually owns those outputs, while paid plans give a full commercial license, and over one million in revenue requires the pro or mega tier. Remember, that license lets you use the image commercially, but it does not hand you a copyright. OpenAI permits commercial use on all tiers and assigns rights to you, but offers no copyright guarantee, and the free and Plus tiers may train on your content. Google's Imagen permits commercial use, needs substantial human input for any copyright, and watermarks everything. Ideogram's paid web plans are commercial, the free plan is public, and the open-weight model is non-commercial. So before you ship, you check two boxes: does the tool's license permit commercial use at your tier, and have you added enough human authorship to claim any protection.
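Those two boxes can be written down as a tiny pre-ship check. This sketch encodes only the terms as summarized in this episode, not the vendors' legal text, and terms change, so read it as a memory aid. The tier labels, function name, and messages are hypothetical.

```python
# Tool and tier pairs the episode describes as not usable commercially.
NO_COMMERCIAL_USE = {
    ("midjourney", "free"),
    ("ideogram", "free"),         # free-plan output is public
    ("ideogram", "open-weight"),  # non-commercial outside the API or an enterprise license
}


def two_box_check(tool: str, tier: str, annual_revenue_usd: float,
                  human_edited: bool) -> list[str]:
    """Return a list of problems; an empty list means both boxes are checked."""
    problems = []
    # Box one: does the license permit commercial use at this tier?
    if (tool, tier) in NO_COMMERCIAL_USE:
        problems.append("license: no commercial use at this tier")
    if (tool == "midjourney" and annual_revenue_usd > 1_000_000
            and tier not in ("pro", "mega")):
        problems.append("license: revenue over $1M needs the pro or mega tier")
    # Box two: is there human authorship on top of the raw output?
    if not human_edited:
        problems.append("copyright: no human authorship added")
    return problems


print(two_box_check("midjourney", "basic", 2_000_000, human_edited=True))
```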
There are two big lawsuits to keep on your radar, not because they're settled but because they shape the risk. In Getty Images versus Stability AI, the UK High Court ruled in November twenty twenty-five with a mixed result: the trademark claim was upheld because Getty's watermarks showed up in outputs, but the copyright claim was rejected on the theory that the model stores compressed representations rather than direct copies. And in the US, Andersen versus Stability AI, brought by a group of artists, has core copyright claims that survived dismissal and a trial date set for September eighth this year. That trial will test whether training on scraped images is itself infringement, and the answer will ripple across every scraped-data tool. None of this stops you from working today, but it's why a tool with clean, transparent training data is worth paying for.
Then there's brand, likeness, and trademark risk, which is the one most likely to bite a marketer specifically. If you prompt for a Coca-Cola logo or a Nike swoosh, you're courting trademark infringement. Use generic descriptors or original designs instead. Firefly is trained to refuse brand and trademark generation, while Midjourney and OpenAI have weaker guardrails, which means the responsibility falls on you. Celebrity likeness has gotten sharper too. Celebrities are now filing trademark registrations on their voice and image. Matthew McConaughey holds eight federal trademarks including his catchphrase and gestures, and Taylor Swift this April registered an audio phrase and an Eras Tour image. So generating something that evokes a real celebrity is a live liability, not a gray area. Even mimicking a living artist's style, while style itself isn't directly copyrightable, can trigger trademark, unfair-competition, or right-of-publicity claims. Steer clear.
Disclosure is the next pillar, and it's enforceable. The Federal Trade Commission, the F-T-C, issued guidance in March twenty twenty-five built on three ideas: transparency, meaning disclose substantial AI involvement; truthfulness, meaning your claims must be substantiated; and endorsements, meaning disclose AI-created apparent endorsements. In practice that can mean a dual label, both a sponsored tag and an AI-generated tag, clear and conspicuous, placed near the content. The penalty is about fifty-three thousand dollars per violation at the twenty twenty-six rate, and each post counts as a separate violation, so it compounds fast. On top of the federal rule, New York state passed a law that took effect June ninth this year requiring conspicuous disclosure whenever a synthetic performer, an AI face or voice, appears in a commercial ad, with fines of a thousand dollars for a first violation and five thousand after that.
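To see how fast per-post penalties stack up, here is the arithmetic as a short sketch. It uses the approximate fifty-three thousand dollar figure and the New York amounts quoted above, and it computes a ceiling on exposure, not a prediction of what a regulator would actually seek. The function names and the twelve-post scenario are illustrative.

```python
FTC_PER_VIOLATION = 53_000  # approximate 2026 figure quoted in the episode


def ftc_exposure(posts: int) -> int:
    """Upper bound if every undisclosed post counts as its own violation."""
    return posts * FTC_PER_VIOLATION


def new_york_fines(violations: int) -> int:
    """$1,000 for the first violation, $5,000 for each one after it."""
    if violations <= 0:
        return 0
    return 1_000 + 5_000 * (violations - 1)


# Illustrative scenario: twelve undisclosed posts in one campaign.
print(ftc_exposure(12))
print(new_york_fines(12))
```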
The last pillar is provenance, the technical plumbing that proves where an image came from. The main standard is called C2PA, the Coalition for Content Provenance and Authenticity. Version two point one was ratified in twenty twenty-five and is now an international standard. It embeds a signed cryptographic manifest into the file recording who made it, with what tool, whether AI was involved, and the edit history. The signers include Adobe, Microsoft, Google, Meta, and OpenAI, and Adobe and Firefly embed these credentials automatically. Google's SynthID is the invisible watermark on Imagen outputs. And the EU AI Act, Article fifty, comes into full effect August second this year, requiring that synthetic images, video, and audio be machine-detectably marked. If you sell into Europe, that one's not optional.
That brings us to the pitfall, the thing that quietly damages brands: AI slop. Let me name the visual tells so you can spot them at a glance. Hands and fingers, with merged or extra digits and impossible joints, improving in the top models but still common in budget tools. Text artifacts, mangled letters and floating words. Waxy or plastic skin and uncanny faces, usually from cranking the guidance setting too high or using too few sampling steps. The generic stock-photo look, oversaturated, artificial smiles, corporate cliche, which is just the model averaging its training data. Warped geometry and impossible anatomy. And artificial, oversaturated colors. The fixes are mostly art direction: be specific, ask for documentary realism or editorial grit instead of leaving it vague, use those negative prompts, lower the guidance, add sampling steps, and name your palette.
Here's why slop is a brand risk and not just an aesthetic nitpick. Visual consistency is your brand voice in image form, the same way your written tone is your brand voice in copy. Visible slop signals low effort. It reads as cheap and inauthentic, and in twenty twenty-six there's a real perception gap where obviously AI imagery lowers your audience's opinion of the brand behind it. The fix is the same discipline we taught for copy: a human review and edit pass. We called it human-in-the-loop, and it applies here exactly. The good news is it's about ten to thirty minutes of refinement per image, not the two to four hours of a custom photo shoot. So you keep most of the speed and you lose the slop.
Here's the human-in-the-loop image workflow concretely. Generate four to eight variations. Do a thirty-second brand check on each: does it match the brand voice and visual identity. Then spot-fix in Photoshop or Canva: fix hands with generative fill, fix mangled text by regenerating or replacing it with clean typography, pull down neon saturation with a curves adjustment, heal waxy areas. Optionally do a color grade to lock your palette. Then export and add content credentials and a copyright notice. And do a final legal check for any real brands, celebrities, or protected work that may have slipped in.
Now let me give you the copyable end-to-end workflow, the thing to actually run. Seven steps.
Step one, write the brief, about five minutes. Name the asset type, blog hero or social card or email header or ad creative. Pull in your brand voice from the profile you already built. Then specify subject, mood, palette, composition such as subject in the left third with negative space on the right for copy, and your aspect ratio and platform sizing.
Step two, pick the right tool for the job. Fast templated graphics and brand assets, use a hosted design tool like Canva. Hero image photorealism, reach for Flux two pro. A hero that needs both text and photorealism, Ideogram then Photoshop. A hero with artistic control or character consistency, Midjourney. On-brand product or character consistency, an Adobe Firefly Custom Model. A commercial-safe hero where you want indemnity, Firefly. Ad creative with copy baked in, Ideogram for one to three lines, Photoshop or Canva for anything denser.
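Step two is really a lookup table, so here it is written as one. The mapping mirrors the list above; the job labels and the function name are hypothetical, and the one-to-three-line cutoff is the episode's rule of thumb for baked-in copy.

```python
# Job-to-tool mapping as listed in step two.
TOOL_FOR_JOB = {
    "templated graphics and brand assets": "hosted design tool such as Canva",
    "photoreal hero": "Flux 2 pro",
    "hero with text and photorealism": "Ideogram, then Photoshop",
    "artistic or character-consistent hero": "Midjourney",
    "on-brand product or character consistency": "Adobe Firefly Custom Model",
    "commercial-safe hero with indemnity": "Firefly",
}


def tool_for_ad_copy(lines_of_copy: int) -> str:
    """One to three lines can be baked in; anything denser gets real type."""
    if lines_of_copy < 1:
        raise ValueError("no copy to place")
    if lines_of_copy <= 3:
        return "Ideogram"
    return "Photoshop or Canva type tool"


print(TOOL_FOR_JOB["photoreal hero"])
print(tool_for_ad_copy(2))
print(tool_for_ad_copy(6))
```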
Step three, prompt and generate, ten to twenty minutes, using that seven-part structure plus any reference and aspect-ratio settings, and let it give you four to start.
Step four, spot-check brand fit and slop, about five minutes. Subject matches the brief, colors match the brand kit, aspect ratio is right, hands and fingers look natural, any text is readable at full size, no waxy faces when you zoom in, it doesn't look like generic stock, and the composition leaves room for copy. If more than two of those fail, regenerate with a refined prompt rather than trying to rescue it in editing.
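The regenerate-or-rescue rule is simple enough to write down. This sketch lists the eight checks from step four and applies the more-than-two-failures threshold; the check labels and function name are hypothetical.

```python
CHECKS = [
    "subject matches brief",
    "colors match brand kit",
    "aspect ratio correct",
    "hands and fingers look right",
    "text readable at full size",
    "no waxy faces on zoom",
    "does not look like generic stock",
    "room left for copy",
]


def next_step(failed: set[str]) -> str:
    """Apply the rule: more than two failed checks means regenerate."""
    unknown = failed - set(CHECKS)
    if unknown:
        raise ValueError(f"unknown checks: {sorted(unknown)}")
    if len(failed) > 2:
        return "regenerate with a refined prompt"
    return "refine in the editor"


print(next_step({"hands and fingers look right"}))
print(next_step({"hands and fingers look right",
                 "no waxy faces on zoom",
                 "room left for copy"}))
```

The first call has one failure, so the image goes on to the editor; the second has three, so it goes back to the prompt.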
Step five, refine in the editor, ten to thirty minutes. In Photoshop, fix hands with object selection and generative fill, fix waxy skin with healing and curves, fix text with generative fill or a clean type overlay, lock color with curves, and export at twice the size you need. In Canva, use Magic Edit, swap backgrounds, add your brand-kit text, and export. Add content credentials.
Step six, clear commercial use, about two minutes. No real brand logos, no celebrity or recognizable person, no copyrighted artwork. Confirm the tool's terms permit commercial use at your tier. And add an AI-assisted disclosure if it's going into a paid ad, to satisfy the F-T-C and New York.
Step seven, export at the right sizes, about five minutes. Meta feed at four-by-five, Stories and Reels at nine-by-sixteen, carousel square. Google Ads in landscape, square, and portrait, each kept under five megabytes. Blog hero at sixteen-by-nine, email hero responsive. Screen images go out at seventy-two dots per inch, print at three hundred dots per inch and in the print color space. A hosted tool's resize feature can spit out every platform variant in one click.
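The print number is where the arithmetic matters, because dots per inch times inches gives the pixels you need. A minimal sketch follows. The ten-inch print width is an assumed example, the fifteen-hundred-pixel source is the rough ceiling mentioned earlier for hero models, and the function names are hypothetical.

```python
import math


def pixels_for_print(inches: float, dpi: int = 300) -> int:
    """Pixels needed along one edge to print at the given size and density."""
    return math.ceil(inches * dpi)


def upscale_factor(source_px: int, inches: float, dpi: int = 300) -> float:
    """How much the source edge must be enlarged to reach print size."""
    return pixels_for_print(inches, dpi) / source_px


# Assumed example: a 1500 px wide hero printed 10 inches wide.
print(pixels_for_print(10))
print(upscale_factor(1500, 10))
```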
A light word on cost, because the next episode owns the deep dive. A hosted design tool runs around fifteen dollars a month. Image-model subscriptions range from ten to a hundred and twenty dollars for Midjourney, twenty for ChatGPT Plus, and Photoshop on its own is around fifty-five a month. Pay-per-use is pennies, roughly three to twelve cents an image depending on the tool. The hidden cost is your edit time, that ten to thirty minutes per image. A rough framing: a stack around a hundred and thirty dollars a month versus a freelance designer at fifty to a hundred and fifty an hour pays for itself in weeks if you're making twenty or more assets a month. We'll get into credits versus subscriptions and cost-per-finished-asset next time.
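Here is that rough framing as arithmetic. The hundred-and-thirty-dollar stack, the twenty assets a month, and the freelancer rates are from the episode. The one hour per asset for a freelancer is purely an assumption to make the comparison concrete, and your own ten to thirty minutes of edit time per image is not priced in.

```python
def stack_cost_per_asset(monthly_stack_usd: float, assets_per_month: int) -> float:
    """Tool subscriptions spread across the assets made that month."""
    return monthly_stack_usd / assets_per_month


def freelancer_cost_per_asset(hourly_rate_usd: float, hours_per_asset: float) -> float:
    """Freelance cost for one asset at a given rate and time."""
    return hourly_rate_usd * hours_per_asset


stack = stack_cost_per_asset(130, 20)
low = freelancer_cost_per_asset(50, 1.0)    # hours_per_asset is an assumption
high = freelancer_cost_per_asset(150, 1.0)  # hours_per_asset is an assumption
print(f"stack: ${stack:.2f} per asset; freelancer: ${low:.0f} to ${high:.0f} per asset")
```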
Let me tie the legal pieces together one more time, because it's easy to hear a stack of rules and forget which ones actually apply to you. Copyright is about whether you can stop others from copying your image, and the answer is mostly no unless you've added real human work. License is about whether you're allowed to use the image commercially at all, and that's set by your tool and your tier. Trademark and likeness are about not borrowing someone else's brand or face. Disclosure is about telling your audience and regulators that AI was involved. And provenance is the invisible record that proves it. Five different questions, five different answers, and a finished marketing image has to clear all five.
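The five questions make a natural sign-off checklist. A minimal sketch, with hypothetical class and field names, where an image is cleared only when all five answers are yes:

```python
from dataclasses import dataclass, fields


@dataclass
class Clearance:
    copyright_human_edits: bool     # real human work added on top of the output
    license_commercial: bool        # tool and tier permit commercial use
    no_trademark_or_likeness: bool  # no borrowed brand, face, or protected work
    disclosure_added: bool          # audience and regulators are told AI was involved
    provenance_attached: bool       # for example, content credentials on the file

    def blockers(self) -> list[str]:
        """Names of the questions that still have a 'no' answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def cleared(self) -> bool:
        return not self.blockers()


draft = Clearance(True, True, False, True, False)
print(draft.cleared())
print(draft.blockers())
```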
So that's the whole arc. Three jobs, one prompt structure, the legal core that lets you actually ship, and a human edit pass that keeps it off the slop pile. Pick one real asset on your desk this week, run the seven steps, and feel where it's faster and where the judgment still has to be yours.