Creating a 100% AI-Generated Video with Google Veo 3

July 22, 2025

Mistakes to avoid, current limitations, and practical tips

After spending weeks producing a presentation video for our virtual agent Emmie, generated entirely with AI (specifically Google Veo 3), one thing became clear: AI video generation is promising, but it is still a demanding process.

Here’s a breakdown of what we learned: what not to do, what still doesn’t work, and how to get the best out of Veo 3.


5 Mistakes to Avoid

1. Underestimating the 8-second limit

Every Veo 3 scene is capped at 8 seconds. Period.
This technical limit has a direct impact on scripting, pacing, and scene transitions. We had to rewrite much of our script to fit this constraint.

Don’t: plan for long shots, continuous dialogue, or complex transitions.
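
As a rough illustration of scripting around the cap, the sketch below groups narration sentences into scenes whose estimated spoken length stays within 8 seconds. The speaking rate of 2.5 words per second and the function names are assumptions made for this example, not anything Veo 3 provides.

```python
MAX_SCENE_SECONDS = 8.0  # per-scene cap described above


def estimate_seconds(text, words_per_second=2.5):
    """Estimate spoken duration; the 2.5 words/second rate is an assumption."""
    return len(text.split()) / words_per_second


def group_into_scenes(sentences, max_seconds=MAX_SCENE_SECONDS, words_per_second=2.5):
    """Greedily pack consecutive sentences into scenes that fit the cap."""
    scenes, current, current_len = [], [], 0.0
    for sentence in sentences:
        length = estimate_seconds(sentence, words_per_second)
        if length > max_seconds:
            # A single sentence that cannot fit must be rewritten by hand.
            raise ValueError(f"Sentence too long for one scene: {sentence!r}")
        if current and current_len + length > max_seconds:
            scenes.append(current)
            current, current_len = [], 0.0
        current.append(sentence)
        current_len += length
    if current:
        scenes.append(current)
    return scenes
```

Any sentence that raises an error here is a line of script to shorten before generating anything.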

2. Assuming visual consistency

Even with ultra-detailed prompts, faces, gestures, and clothing varied from one scene to the next. Emmie never looked exactly like “Emmie” twice.

Don’t: rely on character continuity without planning for heavy post-production.

3. Recording voiceover too early

Since our Veo scenes came out silent, the voice had to be created separately (we used AI for that too). But syncing a pre-recorded voice to unstable visuals is… painful.

Don’t: lock in narration before your visuals are edited and final.

4. Ignoring generation costs

One generation = 100 credits (~$1).
Multiply that by 6–8 scenes × 15–20 variations per scene and you end up with 90–160 generations, or roughly $90–$160 for a single video.

Don’t: prompt blindly. Always budget and plan for iteration.
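
To budget before prompting, the arithmetic can be sketched as below. The per-generation figures are the ones quoted in this article (100 credits, about $1); actual pricing may differ, and the function name is only illustrative.

```python
CREDITS_PER_GENERATION = 100  # figure quoted in this article
USD_PER_GENERATION = 1.0      # approximate, per this article


def generation_budget(scenes, variations_per_scene):
    """Return total generations, credits, and approximate cost in USD."""
    generations = scenes * variations_per_scene
    return {
        "generations": generations,
        "credits": generations * CREDITS_PER_GENERATION,
        "usd": generations * USD_PER_GENERATION,
    }


low = generation_budget(scenes=6, variations_per_scene=15)
high = generation_budget(scenes=8, variations_per_scene=20)
print(low)   # {'generations': 90, 'credits': 9000, 'usd': 90.0}
print(high)  # {'generations': 160, 'credits': 16000, 'usd': 160.0}
```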

5. Relying on automatic subtitles

Veo adds subtitles directly into the video with no option to turn them off. If you don’t want them, you’ll need to manually remove or mask them.

Don’t: assume you can switch subtitles off later. Budget time to remove or mask them.
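
One possible way to remove burned-in subtitles is to crop away the band of the frame where they sit. The sketch below only builds an ffmpeg command using its standard crop filter. It assumes the subtitles occupy the bottom 15% of the frame, which you would need to check per clip, and the file names are placeholders.

```python
def crop_bottom_command(src, dst, keep_fraction=0.85):
    """Build an ffmpeg command that keeps only the top part of the frame.

    keep_fraction=0.85 assumes subtitles sit in the bottom 15%; adjust per clip.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    # crop=width:height:x:y, with iw/ih standing for the input width/height
    crop_filter = f"crop=iw:ih*{keep_fraction}:0:0"
    return ["ffmpeg", "-i", src, "-vf", crop_filter, dst]


cmd = crop_bottom_command("scene_02.mp4", "scene_02_nosubs.mp4")
# To execute: subprocess.run(cmd, check=True) with ffmpeg installed
```

Cropping changes the aspect ratio, so the result usually needs rescaling or reframing in the edit.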


The Current Limits of AI Video Generation

  • Visual inconsistencies: lighting, proportions, character traits vary between scenes

  • Unnatural gestures: body movement often feels rigid or robotic

  • Limited control: camera angles, framing, action direction are unpredictable

  • No built-in audio: visuals are silent; sound must be added separately

  • Low personalization: keeping a character visually stable across scenes is very difficult


Our Top Tips for Working with Veo 3

🎬 1. Stick to 2–3 second shots

You don’t need to use all 8 seconds. The most usable clips are often short and focused.
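
Extracting the usable 2–3 seconds from a longer clip can be scripted. This is a minimal sketch that builds an ffmpeg trim command using the standard -ss (start) and -t (duration) options. The file names and timings are placeholders, and the 8-second check reflects the cap described in this article.

```python
MAX_CLIP_SECONDS = 8.0  # per-scene cap described in this article


def trim_command(src, dst, start, duration):
    """Build an ffmpeg command that keeps `duration` seconds from `start`."""
    if start < 0 or duration <= 0:
        raise ValueError("start must be >= 0 and duration must be > 0")
    if start + duration > MAX_CLIP_SECONDS:
        raise ValueError("requested range runs past the 8-second clip")
    # -ss before -i seeks in the input; -t limits the output duration
    return ["ffmpeg", "-ss", str(start), "-i", src, "-t", str(duration), dst]


cmd = trim_command("scene_03.mp4", "scene_03_cut.mp4", start=1.5, duration=2.5)
# To execute: subprocess.run(cmd, check=True) with ffmpeg installed
```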

✍️ 2. Find the right prompt balance

Too vague = chaos. Too specific = rigidity. Experiment until you find the sweet spot.

♻️ 3. Reuse and iterate on good prompts

Once you find a tone or aesthetic that works, duplicate it and adjust one variable at a time.
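
Changing one variable at a time is easier if the prompt is kept as named parts. The sketch below is purely illustrative: the field names and prompt wording are invented for this example and are not a Veo 3 prompt format.

```python
# Hypothetical prompt broken into named parts (not an official Veo 3 schema)
BASE_PROMPT = {
    "subject": "a friendly virtual agent at a desk",
    "camera": "medium close-up, static",
    "lighting": "soft daylight",
    "style": "clean corporate look",
}
FIELD_ORDER = ("subject", "camera", "lighting", "style")


def render(fields):
    """Join the prompt parts into a single prompt string."""
    return ", ".join(fields[key] for key in FIELD_ORDER)


def one_variable_variants(base, field, options):
    """Return prompts that differ from `base` only in `field`."""
    variants = []
    for option in options:
        fields = dict(base)  # copy so the base prompt is never modified
        fields[field] = option
        variants.append(render(fields))
    return variants


variants = one_variable_variants(
    BASE_PROMPT, "lighting", ["warm evening light", "cool studio light"]
)
```

Because every variant shares the same subject, camera, and style text, any difference between the resulting clips can be attributed to the one field that changed (plus the model's own randomness).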

🧩 4. Plan for serious editing

Think in fragments. You’ll likely need to cut, crop, re-sequence, and mask your way to coherence.

🎛 5. Build your audio layer separately

Since visuals come without sound, plan a parallel audio workflow: narration, music, sound design, etc.


In Summary

Creating a video with Google Veo 3 is both exciting and frustrating.
You need to balance technical constraints, unpredictable output, and unexpected costs. But with patience and structure, it’s possible to produce a surprisingly coherent result without a single actor, camera, or studio.