AI Video Pipeline Case Study

CREATING A PIPELINE

Dispel the “magic black box” view of AI and see how I problem-solve and chain multiple tools together to create a video like this one.

SUMMARY

Brief: Use AI to make a rough, internal concept video showing a metallic surface 'birthing' a bottle.

I chose a video model that accepts a start frame and an end frame: the video opens on Frame A, and the model generates the animation that ends on Frame B.

I already had the end frame (the source image), so I needed to make the start frame. I did this by passing the original image back and forth between Nano Banana (Google Gemini) and Photoshop, then fed both frames into Luma Dream Machine to create the video. Plain English was not precise enough, so I used ChatGPT to translate my prompts into JSON, a structured data format (not a programming language) that lets me spell out each part of a request to the AI in detail.
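To make the translation step concrete, here is a minimal Python sketch of turning named prompt fields into JSON text. The field names and the helper `to_json_prompt` are illustrative assumptions, not a schema that Nano Banana or Luma Dream Machine requires; in this project the translation itself was done by ChatGPT.

```python
import json

# Plain-English version of the request, for comparison:
#   "Show the copper column from the front, then end on a side view."
# Illustrative fields only: these names are assumptions for the sketch,
# not a schema defined by Nano Banana or Luma Dream Machine.
fields = {
    "subject": "copper metallic column",
    "camera": {"start": "frontal view", "end": "side view"},
}

def to_json_prompt(prompt_fields: dict) -> str:
    """Serialize named prompt fields into indented JSON text."""
    return json.dumps(prompt_fields, indent=2)

json_prompt = to_json_prompt(fields)
print(json_prompt)
```

The structured version separates subject from camera direction, so each part can be revised on its own between passes.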

TRANSLATION LAYER

Generate start frame from original/end frame

Feed start/end frame to video model to create video

DETAILED PROCESS

1 DECIDE ON APPROACH

The bottle will be “birthed” from a metallic copper column that starts center frame. The camera will dolly in and rotate to the side as the bottle emerges. The first step is to create the start frame: an image of the copper column, alone and centered, so it can be animated into the final image.

2 REMOVE THE BOTTLE

Begin creating the start frame by removing the bottle from the image with Nano Banana. Work with GPT to translate the plain-English prompt into JSON to feed into Nano Banana. Multiple passes fail to give a clean result, but one produces this image, which can be fixed in Photoshop.

TOOLS: GPT, Nano Banana

3 REMOVE UNWANTED CONTENT P. 1

Use Photoshop Generative Fill to remove the white glass sheet.

TOOLS: Photoshop

4 REMOVE UNWANTED CONTENT P. 2

Use Nano Banana to remove the lens flare.

TOOLS: Nano Banana

5 CLEAN UP TEXTURE

Use Photoshop Generative Fill to remove the rough texture from the column.

TOOLS: Photoshop

6 ZOOM OUT CAMERA

Create JSON prompts with GPT for Nano Banana to zoom the camera out. The result is this column, with a strange light source on the left side.

TOOLS: GPT, Nano Banana

7 REMOVE UNWANTED REFLECTIONS

Use Photoshop Generative Fill to remove the light source on the left side, finalizing the start frame.

TOOLS: Photoshop

8 ANIMATE

Have GPT translate the English prompt into JSON, then feed the JSON prompt and the start and end frames to Luma Dream Machine (an AI video model). Make two versions and use the upscale function to finalize.

TOOLS: Luma Dream Machine
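The eight steps above can be summarized as data. This is a small Python sketch, not part of the actual workflow: it records each step with the tools named on its TOOLS line and counts how often each tool appears. The step labels are lightly paraphrased from the headings, and `tool_usage` is a hypothetical helper.

```python
from collections import Counter

# Steps and tools as listed on the TOOLS lines above.
# Step 1 lists no tools; step 8's description also mentions GPT,
# but only the TOOLS lines are followed here.
PIPELINE = [
    ("Decide on approach", []),
    ("Remove the bottle", ["GPT", "Nano Banana"]),
    ("Remove unwanted content, part 1", ["Photoshop"]),
    ("Remove unwanted content, part 2", ["Nano Banana"]),
    ("Clean up texture", ["Photoshop"]),
    ("Zoom out camera", ["GPT", "Nano Banana"]),
    ("Remove unwanted reflections", ["Photoshop"]),
    ("Animate", ["Luma Dream Machine"]),
]

def tool_usage(pipeline):
    """Count how many steps each tool appears in."""
    return Counter(tool for _, tools in pipeline for tool in tools)

usage = tool_usage(PIPELINE)
for tool, count in usage.most_common():
    print(f"{tool}: {count} step(s)")
```

Seen this way, most of the work is image cleanup that alternates between Nano Banana and Photoshop; the video model appears only in the final step.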

FINAL PROMPT FOR GPT TO TRANSLATE

“Help me create a prompt for Luma Dream Machine. The first frame is the image of the copper metallic column. I want the column to "give birth" to the bottle, stretching a membrane out with it to end with the bottle image I've uploaded. I want the camera to zoom in as the bottle is emerging from the column and to rotate to the side as it's coming out. So, when we first see the column opening up, it's from the front. As the bottle emerges, the camera should rotate to the side until it's in the position for the final image. Please create the prompt in detailed JSON format.” (Uploaded first frame image, last frame image)

FINAL GPT-TRANSLATED JSON PROMPT FOR LUMA VIDEO MODEL

{
  "subject": {
    "form": "copper metallic column",
    "emerging_object": "copper-toned glass bottle",
    "membrane": "translucent elastic film stretching as bottle emerges"
  },
  "action": {
    "transition": "column opens, bottle pushes outward",
    "motion": "fluid expansion of membrane"
  },
  "camera": {
    "start": "frontal view, centered, slight zoom",
    "mid": "rotate smoothly toward three-quarter view as bottle emerges",
    "end": "side view matching final image, tight zoom on bottle",
    "lens": "cinematic 50mm, shallow depth of field"
  },
  "lighting": {
    "style": "soft studio, copper highlights, clean reflections"
  },
  "style": {
    "genre": "surreal cinematic",
    "realism": "photorealistic metallic and glass textures"
  },
  "negatives": {
    "exclude": [
      "static single angle",
      "no rotation",
      "multiple bottles",
      "cartoon look"
    ]
  },
  "weights": {
    "camera_rotation": 4,
    "emergence": 3,
    "membrane": 2
  },
  "seed": 77
}
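Before pasting a prompt like this into a tool, it can help to confirm that it parses as valid JSON and to read back the priorities it encodes. The Python sketch below does that for the "weights" and "seed" entries copied from the prompt above. `ranked_weights` is a hypothetical helper, and nothing here assumes Luma reads these keys in any documented way; the prompt was simply given to the model as text.

```python
import json

# The "weights" and "seed" entries copied from the prompt above.
PROMPT_FRAGMENT = """
{
  "weights": {"camera_rotation": 4, "emergence": 3, "membrane": 2},
  "seed": 77
}
"""

def ranked_weights(prompt_text):
    """Parse JSON text and return weight names, highest priority first."""
    # json.loads raises json.JSONDecodeError if the text is malformed.
    data = json.loads(prompt_text)
    weights = data.get("weights", {})
    return sorted(weights, key=weights.get, reverse=True)

print(ranked_weights(PROMPT_FRAGMENT))
```

A stray comma or missing brace shows up here as a parse error instead of as a confusing result from the video model.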

BUSINESS OUTCOME

This pipeline does not replace talented human designers, but it does eliminate wasted time and budget in production. 3D animation with an agency can cost $15,000 per week for a single 8-second video. Chaining AI tools together as shown above lets us evaluate an idea before spending any of our budget. No more using agencies as expensive brainstorming tools. AI allows us to know what we want in one day, saving us weeks of ideation and tens of thousands of dollars per asset.
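The savings claim can be worked through as simple arithmetic. In this Python sketch the $15,000 weekly rate comes from the text; the number of ideation weeks is a hypothetical input, since the text says only "weeks".

```python
AGENCY_RATE_PER_WEEK = 15_000  # USD, figure quoted in the text

def ideation_cost(weeks: int, rate_per_week: int = AGENCY_RATE_PER_WEEK) -> int:
    """Cost of using an agency for a given number of ideation weeks."""
    return weeks * rate_per_week

# Hypothetical scenario: two weeks of agency ideation avoided.
print(ideation_cost(2))  # 30000
```

At that rate, even two weeks of agency ideation reaches the "tens of thousands" range for a single asset.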
