Key takeaways
- The feed is muted and vertical by default: Start from the real viewing condition, because silent, upright viewing is the rule, not the exception.
- What "works on mute" actually requires: An ad that works on mute is not an ad with captions bolted on; it is one whose meaning survives without sound.
- Vertical is not landscape with bars: A vertical frame is its own composition, not a landscape shot with the sides removed.
There is a gap between how ads are made and how they are watched. They are made on big horizontal monitors, in quiet rooms, with the sound up. They are watched on a phone held upright, in a noisy place, with the sound off, in a feed that is moving. The ad that was designed for the first condition and merely adapted to the second is the ad that quietly underperforms, not because the idea was weak, but because the format it actually runs in was an afterthought.
This is not a new problem, but it is a newly cheap one to solve. For most of video's history, designing properly for the sound-off vertical feed meant either shooting twice or accepting a compromise. AI lowers that cost enough that there is no longer an excuse to treat the feed as a place you crop down to. The format you are actually designing for can be the format you design from.
The feed is muted and vertical by default
Start from the real viewing condition, because it is not the exception. It is the rule. A large majority of feed video is watched with the sound off, at least at first; people scroll in offices, on transit, in bed beside someone asleep, and they decide whether to keep watching long before they decide whether to turn the sound on. The phone is vertical because that is how it is held, and turning it sideways is friction almost no one accepts for an ad.
So the honest default is this: silent, vertical, in motion, competing with a thumb. Any creative decision that assumes otherwise (that the music will carry the mood, that the voiceover will explain the product, that the wide composition will read) is designing for a screen most of your audience is not using. The sound and the landscape framing are bonuses for the minority who opt in, not the channel through which the message arrives.
What "works on mute" actually requires
An ad that works on mute is not an ad with captions bolted on. It is one where the meaning survives the sound being gone. That is a higher bar than subtitling, and it shapes the whole creative. The visual has to carry the story on its own; the captions have to be designed, not transcribed; and the first frame has to communicate before anyone has chosen to engage.
Captions are the clearest example of the shift. Treated as accessibility, they are an afterthought: small, generic, dropped at the bottom. Treated as design, they become a primary visual element: large, paced to the cut, positioned where the eye already is, carrying the line that the voiceover would have carried if anyone could hear it. In a muted feed, the caption is not a transcript of the ad. Often it *is* the ad's voice.
If your ad stops making sense the moment you hit mute, you have not made a feed ad. You have made a TV ad that happens to be playing in a feed, and the feed will treat it accordingly.
Vertical is not landscape with bars
The other half of the format is the frame, and the common mistake is treating vertical as landscape with the sides removed. It is not. A vertical frame is a different composition: it favours a single subject over a wide scene, foregrounds faces and product over environment, and rewards motion that runs up and down the frame rather than across it. Crop a horizontal shot to vertical and you usually lose the subject, the context, or both. Letterbox it instead, with black bars top and bottom, and it reads instantly as repurposed and gets scrolled past.
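The cost of both shortcuts, cropping and letterboxing, can be put in numbers. The sketch below is illustrative arithmetic only, assuming a 16:9 landscape master and a 9:16 vertical frame; the function name `aspect_overlap` is made up for this example.

```python
from fractions import Fraction

def aspect_overlap(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Fraction:
    """Share of one frame that overlaps the other when aspect ratios differ.

    Read it two ways:
    - crop to fill:  the share of the source picture that survives the crop
    - fit with bars: the share of the target screen covered by picture
    Both come out to the same ratio.
    """
    src = Fraction(src_w, src_h)  # source aspect ratio, width / height
    dst = Fraction(dst_w, dst_h)  # target aspect ratio, width / height
    return min(src, dst) / max(src, dst)

# Assumed formats: 16:9 landscape master, 9:16 vertical feed frame.
share = aspect_overlap(16, 9, 9, 16)
print(share, f"{float(share):.1%}")  # 81/256 31.6%
```

Under those assumptions, a full-height crop keeps under a third of the original picture, and a letterboxed version fills under a third of the phone screen. Either way, roughly two thirds of something is lost.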
Designing vertical from the start changes the shot, not just the aspect ratio. A few rules earn their keep:
- Open on the visual, not the wind-up. The first frame has to do work before any sound or context arrives: a clear subject, a recognisable product, a visual hook that reads in the half-second before the thumb decides.
- Treat captions as a design layer. Size, timing, and placement of text are creative decisions, not a post-production checkbox. Keep captions clear of the platform's UI overlays and inside the part of the frame the eye actually lands on.
- Compose for the upright frame and its safe zones. Keep the subject and the key message away from the very top and bottom, where the platform's interface elements cover the picture, and build the shot around a single focal point rather than a wide scene.
None of these are exotic. They are simply what it means to design for the screen the ad runs on instead of the screen it was edited on.
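The safe-zone rule is easy to check mechanically before a cut ships. The following is a minimal sketch, not any platform's specification: the 1080x1920 frame, the margin percentages, and the names `Box`, `safe_zone`, and `fits` are all assumptions for illustration. Substitute the margins your target platform publishes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels, origin at the top-left of the frame."""
    left: int
    top: int
    right: int
    bottom: int

def safe_zone(frame_w: int, frame_h: int,
              top_pct: int, bottom_pct: int, side_pct: int) -> Box:
    """Region left after reserving margins (whole percentages) for platform UI."""
    side = frame_w * side_pct // 100
    return Box(
        left=side,
        top=frame_h * top_pct // 100,
        right=frame_w - side,
        bottom=frame_h - frame_h * bottom_pct // 100,
    )

def fits(inner: Box, outer: Box) -> bool:
    """True when `inner` lies entirely inside `outer`."""
    return (inner.left >= outer.left and inner.top >= outer.top
            and inner.right <= outer.right and inner.bottom <= outer.bottom)

# Hypothetical margins, NOT taken from any platform's published guidance.
zone = safe_zone(1080, 1920, top_pct=15, bottom_pct=20, side_pct=5)

mid_caption = Box(left=90, top=1100, right=990, bottom=1400)
low_caption = Box(left=90, top=1500, right=990, bottom=1800)

print(fits(mid_caption, zone))  # True
print(fits(low_caption, zone))  # False: runs into the reserved bottom margin
```

The same check works for the subject's bounding box or a product shot; the point is only that "inside the safe zone" can be a test rather than a judgment made by eye on an editing monitor.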
Where AI helps, and where it cuts corners
The reason this is worth raising now is that AI changes the economics of doing it right. Producing a genuinely vertical, mute-legible cut (captioned as a design element, framed for a single subject, opening on a strong first frame) used to add cost and time on top of the "main" landscape version. When variants are cheap, the vertical sound-off cut stops being a downgraded export and becomes a first-class version you can actually design and test.
But cheap also makes it easy to cut the corners the format cares about most: auto-captions that drift out of sync with the cut, a horizontal clip lazily padded to fill a vertical frame, generic text slapped on without regard to where the eye goes. AI will produce all of these instantly, and they all read as exactly what they are. The format does not get more forgiving because the production got cheaper. If anything, a feed full of competent vertical creative makes the half-effort version stand out for the wrong reason.
The teams that win the muted, vertical feed are not the ones who add captions last. They are the ones who treat sound-off, upright, in-motion viewing as the brief: they design the first frame to stop the scroll without sound, the captions to carry the message, and the composition for the screen people actually hold. Then they use cheap AI variation to make that the default version, not the afterthought.
Sources
- Meta, "Designing for sound-on and sound-off: best practices for mobile feed video," Meta for Business insights, 2025.
- Think with Google, "Vertical video and the realities of mobile viewing," 2024.
- Verizon Media / Publicis, "The sound-off majority: how people really watch mobile video," 2024.
Frequently asked questions
- Why should marketing teams treat the muted, vertical feed as the default?
- Because it is the real viewing condition, not the exception: most feed video starts with the sound off, on a phone held upright.
- What does it take for an ad to work on mute?
- More than captions bolted on. The meaning has to survive without sound: the visual carries the story, the captions are designed rather than transcribed, and the first frame communicates before anyone has chosen to engage.
- Why is vertical not just landscape with bars?
- Because a vertical frame is a different composition. It favours a single subject over a wide scene, foregrounds faces and product over environment, and rewards motion that runs up and down the frame.

