AI Video Gen

AI video generation has come a long way. Having tried out several tools -- Hailuo, Kling, Runway, and others -- I wanted to share what I have learned about the current state of text-to-video and image-to-video AI.

The Current Landscape

The AI video generation space is moving fast. The major players right now include MiniMax's Hailuo, Kling, Runway, OpenAI's Sora, and Google's Veo.

Key Insight: Start with an Image

The biggest lesson I have learned: always start with an image first. Think of it as image-to-video rather than text-to-video. You can use AI image generation tools like Flux or even Grok to create your starting frame, then animate it with a video generation tool.

This approach gives you far more control over the final output than trying to describe everything in a text prompt. Text prompts alone often lead to unpredictable results, especially with tools like Sora where overly detailed prompts can actually hurt quality.

How I Think About These Tools

Most professional creators combine multiple tools in their workflow:

  1. Create a key frame using an image generation tool like Midjourney or Flux

  2. Animate it using Runway, Kling, or another video model

  3. Add finishing touches like lip-sync, upscaling, or compositing
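The three-step workflow above can be sketched as a simple pipeline. This is a minimal illustration, not real tool integration: the function names, the `Asset` type, and the tool labels are all hypothetical stand-ins for calls to actual services like Flux or Runway.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for real tool APIs -- in practice each step
# would call out to the actual image/video generation service.

@dataclass
class Asset:
    kind: str      # "image" or "video"
    source: str    # which tool produced it
    detail: str    # prompt or processing history

def generate_keyframe(prompt: str, tool: str = "flux") -> Asset:
    """Step 1: create the starting frame with an image model."""
    return Asset(kind="image", source=tool, detail=prompt)

def animate(frame: Asset, motion_prompt: str, tool: str = "runway") -> Asset:
    """Step 2: animate the key frame with a video model."""
    return Asset(kind="video", source=tool,
                 detail=f"{frame.detail} | {motion_prompt}")

def post_process(clip: Asset, steps: list[str]) -> Asset:
    """Step 3: finishing touches (lip-sync, upscaling, compositing)."""
    return Asset(kind="video", source=clip.source,
                 detail=f"{clip.detail} | {' + '.join(steps)}")

frame = generate_keyframe("a lighthouse at dusk, cinematic lighting")
clip = animate(frame, "slow push-in, waves crashing")
final = post_process(clip, ["upscale", "color-grade"])
print(final.kind)  # video
```

The point of the structure is the hand-off: the image step fully determines the starting frame, so the video step only has to describe motion -- which is exactly why image-to-video gives more control than a single text prompt.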

What to Watch For

The space is evolving rapidly. Google's Veo model is pushing the boundaries of photorealism, and open-source models are catching up fast. The tools that will win are the ones that give creators the most control while keeping the workflow simple.

If you are just getting started with AI video generation, my advice is to pick one tool, learn its strengths and limitations, and build from there. Do not try to master everything at once -- the landscape changes too quickly for that to be a productive strategy.

Nov 1, 2024 · 3 min read

Enjoyed this post?

I write a newsletter on product, AI, and startups called The Discourse with 5K+ subscribers. Deep dives, no fluff.

Subscribe to The Discourse →