AI Video Gen
📽️

Here is what I learnt about AI video gen while researching it late last year.

Top Platforms

  1. Runway https://runwayml.com/
  1. Luma Labs https://lumalabs.ai/
  1. Kling https://klingai.com/global/
  1. Hailuo https://hailuoai.video/
  1. Minimax
  1. Sora https://sora.chatgpt.com/explore
  1. Pika https://pika.art
Each platform has its own strengths: some work better with people, others with landscapes.

Workflow for AI video gen

Don’t go straight from text → video. The results are much worse than going text → image and then high-res image → video.
So the workflow is:
  1. (Optional) Use ChatGPT / Claude to prepare a good image prompt
  1. Use that image prompt on a model like Midjourney, Flux 1.1 Pro https://replicate.com/black-forest-labs (no subscription - pay per use), or the ChatGPT 4o image model on the paid version
  1. Upscale that image with a separate tool such as https://magnific.ai/
  1. Now go to the video gen platform and combine the image with a written text prompt to create the 5-10s video
  1. Alternatively (and this is the recommended route), create two image keyframes: a start frame and an end frame. That gives the video model a clear idea of how the image should be animated. You can then animate the middle via interpolation, motion paths, or motion brushes.
  1. In short, use a combination of image reference, text prompt, and video tool settings (some of the AI platforms have advanced settings that you can use)
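
The steps above can be sketched as a small description of one clip request. This is only an illustration: `ClipJob`, its field names, and the file names are hypothetical and do not correspond to any platform's real API. The 5-10s limit is the one mentioned in the steps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClipJob:
    """One image-to-video request (hypothetical structure, not a real API)."""
    start_frame: str                  # path to the upscaled start image
    motion_prompt: str                # short text describing the motion
    duration_s: int = 5               # the tools produce 5-10s clips
    end_frame: Optional[str] = None   # optional second keyframe

    def validate(self) -> None:
        if not 5 <= self.duration_s <= 10:
            raise ValueError("clip length should be between 5 and 10 seconds")
        if not self.motion_prompt.strip():
            raise ValueError("pair the image with a text prompt")

    def mode(self) -> str:
        # With two keyframes the model is told where the motion should end up.
        return "start+end keyframes" if self.end_frame else "single image"


job = ClipJob(
    start_frame="castle_dawn_upscaled.png",   # hypothetical file names
    end_frame="castle_noon_upscaled.png",
    motion_prompt="slow push-in, mist clearing, soft morning light",
    duration_s=5,
)
job.validate()
print(job.mode())
```

Writing the request down like this makes it easy to check, before spending credits, that every clip has an image, a prompt, and a length the platform can actually produce.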

Insights

  • Prompts
    • Keep image-to-video text prompts concise: neither too long nor too short
    • Include cinematic descriptors (e.g. “dramatic,” “wide-angle,” “soft focus”)
    • Learn the camera framing vocabulary that a director of an ad / movie / show would know, e.g. long shot, close shot etc.
  • Consistency
    • Lock seed or style parameters across steps (Midjourney, Fal.ai)
    • Export 10 reference images to train a Flux LoRA for a uniform look
    • ChatGPT can also be used
  • Motion Strategies
    • Use simple motion paths for objects/characters
    • For landscapes, add parallax or subtle camera pans
  • Keyframe Definitions
    • Start frame: establish setting and mood
    • End frame: illustrate transformation or reveal
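
A minimal sketch of how the prompt and consistency tips could be combined: the same seed and cinematic descriptors are reused for every shot, and only the subject, framing, and motion change. The names `StyleLock`, `build_prompt`, and `is_concise` are hypothetical, and the 8-40 word bounds are an arbitrary assumption, not a limit documented by any platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleLock:
    """Settings reused unchanged for every shot to keep a uniform look."""
    seed: int
    descriptors: tuple


def build_prompt(subject: str, framing: str, motion: str, style: StyleLock) -> str:
    # Framing first, then what is in the shot, how it moves, and the shared style.
    parts = [framing, subject, motion, *style.descriptors]
    return ", ".join(p.strip() for p in parts if p.strip())


def is_concise(prompt: str, min_words: int = 8, max_words: int = 40) -> bool:
    # Assumed bounds: long enough to describe the shot, short enough to stay focused.
    return min_words <= len(prompt.split()) <= max_words


style = StyleLock(seed=1234, descriptors=("dramatic", "wide-angle", "soft focus"))
prompt = build_prompt(
    subject="a lighthouse on a cliff",
    framing="long shot",
    motion="slow camera pan to the right",
    style=style,
)
print(prompt)
print(is_concise(prompt))
```

Because `StyleLock` is frozen, the seed and descriptors cannot drift between shots by accident, which is the point of locking them.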

BG Music

You can use the following platforms for bg music:
  • Suno
  • Udio
Since the video gen tools produce 5-10s clips, you can create music that has a beat shift every 5, 7, or 10s to match the video.
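
To work out where those beat shifts should fall, the arithmetic can be sketched as below. The clip lengths, the 96 BPM tempo, and the 4-beat bar are example assumptions. The helper names are hypothetical.

```python
from itertools import accumulate


def cut_points(clip_lengths_s):
    """Timestamps (in seconds) where one clip ends and the next begins."""
    return list(accumulate(clip_lengths_s))[:-1]


def bars_per_clip(clip_length_s, bpm, beats_per_bar=4):
    """How many bars of music fit into one clip at the given tempo."""
    seconds_per_bar = beats_per_bar * 60 / bpm
    return clip_length_s / seconds_per_bar


clips = [5, 7, 10, 5]            # example sequence of clip lengths in seconds
print(cut_points(clips))         # [5, 12, 22]
print(bars_per_clip(5, 96))      # 2.0
print(bars_per_clip(10, 96))     # 4.0
print(bars_per_clip(7, 96))      # 2.8
```

At 96 BPM a bar lasts 2.5s, so 5s and 10s clips end exactly on a bar line while a 7s clip does not. If the cuts should land on the beat, it helps to pick clip lengths and a tempo that divide evenly.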

Talking videos

If you want to do talking heads with lipsync you can use: