Dispel the “magic black box” view of AI, and see how I problem-solve and chain multiple tools together to create a video like this one.
Brief: Make a rough, internal concept video showing a metallic surface 'birthing' a bottle, using AI.
I chose a model that accepts a start frame and an end frame: the video opens on Frame A, and the model generates the animation that ends on Frame B.
I already had the end frame (the source image), so I needed to make the start frame by passing the original image between Nano Banana (Google Gemini) and Photoshop, and then feed both frames into Luma Dream Machine to create the video. Plain English was not precise enough, so I used ChatGPT to translate my prompts from English into JSON, a structured data format that lets me spell out each instruction to the AI in detail.
Bottle will be “birthed” from metallic copper column, which starts center frame. Camera will dolly in and rotate to the side as bottle emerges. First step is to create the starting frame image of copper column, alone and centered, so it can be animated into the final image.
Begin creating the start frame by removing the bottle from the source image with Nano Banana. Work with GPT to translate the plain-English prompt into JSON to feed into Nano Banana. Multiple passes fail, but they produce this image, which can be fixed in Photoshop.
TOOLS: GPT, Nano Banana
Use Photoshop Generative Fill to remove white glass sheet.
TOOLS: Photoshop
Use Nano Banana to remove lens flare.
TOOLS: Nano Banana
Use Photoshop Generative Fill to remove rough texture from column.
TOOLS: Photoshop
Create JSON prompts with GPT for Nano Banana to generate this column, which comes out with a strange light source on the left side.
TOOLS: GPT, Nano Banana
Use Photoshop Generative Fill to remove the light source on the left side, finalizing the start frame.
TOOLS: Photoshop
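The start-frame steps above can be summarized as data, which makes the back-and-forth between tools easier to see. This is only an illustrative sketch: the step descriptions are paraphrased from the notes above, and the `START_FRAME_STEPS` name is made up for this example.

```python
from collections import Counter

# Each entry: (what the step does, tools used), paraphrased from the steps above.
START_FRAME_STEPS = [
    ("Remove bottle from source image", ["GPT", "Nano Banana"]),
    ("Remove white glass sheet", ["Photoshop"]),
    ("Remove lens flare", ["Nano Banana"]),
    ("Remove rough texture from column", ["Photoshop"]),
    ("Generate column from JSON prompts", ["GPT", "Nano Banana"]),
    ("Remove light source on left side", ["Photoshop"]),
]

# Count how many steps each tool takes part in.
tool_usage = Counter(tool for _, tools in START_FRAME_STEPS for tool in tools)

for number, (task, tools) in enumerate(START_FRAME_STEPS, start=1):
    print(f"{number}. {task} [{', '.join(tools)}]")
print(dict(tool_usage))
```

No single tool finishes the job: each one cleans up what the previous pass left behind.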
Have GPT translate the English prompt into a JSON prompt, then feed the start and end frames to Luma Dream Machine (video AI model). Make two versions and use the upscale function to finalize.
TOOLS: GPT, Luma Dream Machine
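As a minimal sketch of this hand-off, a prompt can be assembled as a Python dictionary and serialized to JSON, which guarantees the text pasted into the video tool is well-formed. The field names mirror a subset of the prompt used in this project. The `build_prompt` helper is hypothetical, and nothing here assumes that Luma Dream Machine interprets these keys in any particular way.

```python
import json

def build_prompt(seed: int = 77) -> dict:
    """Return a trimmed version of the video prompt as a dictionary (hypothetical helper)."""
    return {
        "subject": {
            "form": "copper metallic column",
            "emerging_object": "copper-toned glass bottle",
        },
        "camera": {
            "start": "frontal view, centered, slight zoom",
            "mid": "rotate smoothly toward three-quarter view as bottle emerges",
            "end": "side view matching final image, tight zoom on bottle",
        },
        "negatives": {"exclude": ["static single angle", "multiple bottles"]},
        "seed": seed,
    }

# Serialize to JSON text that can be pasted into the prompt field.
prompt_text = json.dumps(build_prompt(), indent=2)

# Round-trip check: json.loads raises an error if the text is malformed.
assert json.loads(prompt_text) == build_prompt()
print(prompt_text)
```

Building the prompt in code rather than by hand avoids missing commas or unbalanced braces, which are easy to introduce when editing JSON in a chat window.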
“Help me create a prompt for Luma Dream Machine. The first frame is the image of the copper metallic column. I want the column to "give birth" to the bottle, stretching a membrane out with it to end with the bottle image I've uploaded. I want the camera to zoom in as the bottle is emerging from the column and to rotate to the side as it's coming out. So, when we first see the column opening up, it's from the front. As the bottle emerges, the camera should rotate to the side until it's in the position for the final image. Please create the prompt in detailed JSON format.” (Uploaded first frame image, last frame image)
{
"subject": {
"form": "copper metallic column",
"emerging_object": "copper-toned glass bottle",
"membrane": "translucent elastic film stretching as bottle emerges"
},
"action": {
"transition": "column opens, bottle pushes outward",
"motion": "fluid expansion of membrane"
},
"camera": {
"start": "frontal view, centered, slight zoom",
"mid": "rotate smoothly toward three-quarter view as bottle emerges",
"end": "side view matching final image, tight zoom on bottle",
"lens": "cinematic 50mm, shallow depth of field"
},
"lighting": {
"style": "soft studio, copper highlights, clean reflections"
},
"style": {
"genre": "surreal cinematic",
"realism": "photorealistic metallic and glass textures"
},
"negatives": {
"exclude": [
"static single angle",
"no rotation",
"multiple bottles",
"cartoon look"
]
},
"weights": {
"camera_rotation": 4,
"emergence": 3,
"membrane": 2
},
"seed": 77
}
This pipeline does not replace talented human designers, but it eliminates wasted time and budget in production. 3D animation with an agency can cost $15,000 per week for a single 8-second video. Chaining AI tools together as shown above lets us evaluate an idea before spending any of our budget. No more using agencies as expensive brainstorming tools. AI lets us know what we want in one day, saving us weeks of ideation and tens of thousands of dollars per asset.
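To make the budget arithmetic concrete, here is a rough sketch using the $15,000 weekly agency rate quoted above. The number of weeks is an assumption chosen for illustration, not a figure from this project, and `agency_ideation_cost` is a hypothetical helper.

```python
AGENCY_RATE_PER_WEEK = 15_000  # USD, the weekly agency figure quoted above

def agency_ideation_cost(weeks: int, rate_per_week: int = AGENCY_RATE_PER_WEEK) -> int:
    """Cost of exploring an idea with an agency for a given number of weeks."""
    if weeks < 0:
        raise ValueError("weeks must be non-negative")
    return weeks * rate_per_week

# ASSUMPTION: two weeks of agency ideation, picked only to illustrate the math.
print(agency_ideation_cost(2))  # 30000
```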