How to create a Street Food Spice Trial video

veo-3.1-fast · 9:16 · 8s

The prompt

Subject: Traveler gripping a dented tin plate heaped with a fiery local street snack; olive linen button-up with sleeves rolled, worn leather shoulder bag with a bamboo straw poking out.

Action: (0:00–0:07) Turns to lens with a wide grin and lifts the plate; dips camera/plate downward briefly to showcase the dish; leans back in with a low, conspiratorial smirk; delivers hook lines; holds a single silent beat; closes with "let's find out."

Scene: Late-night hawker alley in Chiang Mai; amber lanterns and strung Edison bulbs overhead; steam rising from clay pots and flat iron griddles; wooden push carts; low rattan stools; hand-chalked menu boards; motorbikes nudging through the crowd; customers clustered around stalls; damp cobblestones mirroring orange and pink light; chili paste glistening with dried chilies, toasted peanuts, lime zest, and torn holy basil; vendor spooning sauce in the background; foot traffic adds layered depth.

Style: Vertical 9:16, handheld selfie at arm's length, 24–26mm phone-wide; traveler framed in upper third with stall and plate anchoring lower frame; single continuous take with optional micro whip or tap-to-focus for B-roll inserts; warm practical lantern light as key, pink neon as rim; natural "phone-real" color grade, no heavy processing. Camera is the phone in the traveler's hand at chest-to-eye level.

Dialogue:
Traveler says: "I'm in Chiang Mai. Locals swear this is the hottest thing on the street."
Traveler says: "Look at that chili paste… it's practically glowing." (amused disbelief)
Traveler says: "Apparently most visitors can't make it past the first bite." (conspiratorial lean)
Traveler says: "Let's find out." (confident grin)

Voice-Over: None; all lines are direct on-camera delivery.

Sound Effects: Traveler's voice clean and upfront; ambient hawker alley bed of griddle hiss, iron spatula scrapes, vendor shouts, crowd murmur (mixed low-to-mid); no unwanted music (optional ultra-low lo-fi texture kept under −20 LUFS); short sizzle accent timed to the plate tilt-down.

NEGATIVE: subtitles, captions, watermarks, text overlays, logos, poor lighting, low resolution, compression artifacts, oversaturation, over-sharpening, inconsistent character appearance, cartoonish skin, distorted hands, audio sync issues, banding, jitter beyond natural handheld wobble, extreme motion blur, unwanted crowd screams.

How it works

  1. Tweak the prompt or pick a different model.
  2. Hit Generate, and your clip renders in seconds.
  3. Open it in the editor to build a full video.
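
For step 1, it can help to keep the labeled sections of the prompt separate and join them only when you are ready to paste. The sketch below is one possible way to do that in plain Python. The helper name `build_prompt` and the `SECTION_ORDER` list are hypothetical and not part of any tool's API; the section labels are the ones used in the prompt above, and the section texts are shortened for illustration.

```python
# Hypothetical helper for editing the prompt one section at a time.
# Nothing here calls a video model; it only assembles a text string.

# Section labels in the order they appear in the prompt above.
SECTION_ORDER = [
    "Subject", "Action", "Scene", "Style",
    "Dialogue", "Voice-Over", "Sound Effects", "NEGATIVE",
]


def build_prompt(sections: dict[str, str]) -> str:
    """Join labeled sections in a fixed order, skipping missing or empty ones."""
    parts = []
    for label in SECTION_ORDER:
        text = sections.get(label, "").strip()
        if text:
            parts.append(f"{label}: {text}")
    return " ".join(parts)


# Shortened section texts taken from the prompt above.
sections = {
    "Subject": "Traveler gripping a dented tin plate heaped with a fiery local street snack.",
    "Scene": "Late-night hawker alley in Chiang Mai.",
    "Style": "Vertical 9:16, handheld selfie at arm's length.",
    "NEGATIVE": "subtitles, captions, watermarks.",
}

# Tweak a single section without touching the others.
sections["Style"] = "Vertical 9:16, handheld selfie at arm's length, 24–26mm phone-wide."

print(build_prompt(sections))
```

Keeping the order fixed means a reworded section always lands in the same place, so two variants of the prompt stay easy to compare.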
