Generate Images, Voiceovers, and Music Without Leaving Stella
By Stella Team
To create one product video, you currently need Midjourney for images, ElevenLabs for voiceovers, Suno for music, stock sites for footage, and your video editor to put it all together.
That's 5 tabs, 5 subscriptions, and 5 different interfaces. Every time you need an asset, you leave your editor, generate it elsewhere, download it, and import it back.
Context switching kills productivity
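An often-cited figure from interruption research is roughly 23 minutes to fully refocus after a disruption, and every tab switch invites one.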
Stella consolidates all of this into one interface. Generate what you need without leaving your timeline.
AI Image Generation
How It Works
In the AI chat, describe the image you need:
"Generate a lifestyle image of a woman in her 30s drinking coffee from a ceramic mug, sitting by a window with soft morning light, cozy aesthetic"
Stella generates a high-quality image that matches your project's aspect ratio, adds it to your media library automatically, and offers to insert it into your timeline.
Prompt Structure for Best Results
Great image prompts follow this formula:
[Subject] + [Action/Pose] + [Setting] + [Lighting] + [Style/Mood]
| Use Case | Prompt Example |
|---|---|
| Product hero | "Ceramic mug on marble counter, soft studio lighting, minimal background" |
| Lifestyle | "Hands holding smartphone, outdoor café, natural daylight, candid moment" |
| Texture/detail | "Extreme close-up of leather texture, shallow depth of field, warm tones" |
| Flat lay | "Top-down skincare products on white surface with eucalyptus leaves" |
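If you write many prompts, the formula above can be kept consistent with a small helper. This is only a sketch: `ImagePrompt` and its field names are hypothetical and are not part of Stella. It simply joins the five parts in formula order and skips any you leave empty, producing text you can paste into the AI chat.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical container for the five parts of the prompt formula."""
    subject: str
    action: str = ""
    setting: str = ""
    lighting: str = ""
    style: str = ""

    def render(self) -> str:
        # Join the parts in formula order, skipping any that were left empty.
        parts = [self.subject, self.action, self.setting, self.lighting, self.style]
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = ImagePrompt(
    subject="Ceramic mug",
    setting="on marble counter",
    lighting="soft studio lighting",
    style="minimal background",
)
print(prompt.render())
```

Keeping the parts separate makes it easy to reuse one setting and lighting across a whole set of product shots while changing only the subject.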
Style Control
Example looks: Product Hero (clean, studio-lit), Lifestyle (natural, contextual), Macro Detail (texture focus), Flat Lay (top-down arranged).
Add a style keyword to your prompt to steer the overall look:
- "Product photography style" → Clean, professional, studio-lit
- "Editorial style" → Magazine-quality, aspirational
- "UGC style" → Casual, authentic, smartphone-look
- "Cinematic" → Dramatic lighting, film-like color grading
What it's good at
- ✅ Product photography (flat lay, hero shots, detail shots)
- ✅ Lifestyle imagery (people using products, environments)
- ✅ Backgrounds and textures
- ✅ Abstract and graphic elements
What to avoid
- ❌ Exact brand logos or trademarked items
- ❌ Specific real people's faces
- ❌ Complex text within images (use Stella's text tools instead)
AI Voiceover Generation
Available Voices
Stella includes multiple professional voice options:
| Voice | Characteristics | Best For |
|---|---|---|
| Warm British Male | Trustworthy, articulate | Product explainers, B2B |
| Friendly Female | Approachable, conversational | Tutorials, lifestyle brands |
| Soft Female | Gentle, calming | Wellness, luxury brands |
| Authoritative American Male | Confident, commanding | Sales videos, announcements |
| Energetic Young Male | Upbeat, enthusiastic | Promos, youth-focused |
| Calm Young Male | Relaxed, genuine | Tech products, apps |
Creating Voiceovers
Option 1: Write your script
"Create a voiceover using the warm British male voice: 'Introducing the future of home audio. Crystal clear sound, thoughtfully designed.'"
Option 2: Let Stella write it
"Write and generate a voiceover for this product video. Keep it under 15 seconds, highlight the key benefit."
Tips for natural-sounding voiceovers
Punctuation affects delivery:
- Commas create natural pauses
- Periods create longer pauses
- Em dashes create dramatic pauses
Weak: "Our product is great it saves time and money and everyone loves it"
Strong: "Our product is great. It saves time, and money. Everyone loves it."
Keep sentences short. Long sentences sound breathless when generated.
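One way to apply the "keep sentences short" tip before generating is to flag long sentences in a draft script. The sketch below is an illustration only: the `long_sentences` helper and the 10-word threshold are assumptions, not a Stella feature or limit, so tune the threshold to your own voice and pacing.

```python
import re

MAX_WORDS = 10  # assumed threshold for illustration, not a Stella limit


def long_sentences(script: str, max_words: int = MAX_WORDS) -> list[str]:
    """Return the sentences in a script that contain more than max_words words."""
    # Split wherever ., !, or ? is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", script) if s.strip()]
    return [s for s in sentences if len(s.split()) > max_words]


weak = "Our product is great it saves time and money and everyone loves it"
strong = "Our product is great. It saves time, and money. Everyone loves it."

print(long_sentences(weak))    # the whole run-on is flagged
print(long_sentences(strong))  # nothing is flagged
```

The unpunctuated version counts as a single 13-word sentence, which is why it gets flagged while the punctuated version passes.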
AI Music Generation
How It Works
Describe the music you need:
"Generate 30 seconds of calm acoustic background music with light piano and soft guitar, no vocals, suitable for a premium product video"
Prompt Structure for Music
[Duration] + [Mood] + [Instruments/Genre] + [Tempo] + "no vocals"
| Video Type | Music Prompt |
|---|---|
| Product launch | "20 seconds upbeat electronic, energetic but not overwhelming, no vocals" |
| Testimonial | "30 seconds soft ambient piano, minimal, emotional but not sad, no vocals" |
| Luxury brand | "25 seconds elegant orchestral, strings and piano, sophisticated, no vocals" |
| Tech product | "30 seconds modern electronic, clean and futuristic, medium tempo, no vocals" |
Always Specify "No Vocals"
Generated vocals often sound artificial. For background music in videos, instrumental tracks work better. Always include "no vocals" or "instrumental only" in your prompt.
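The music formula and the "no vocals" rule can be combined in a small prompt builder so the instruction is never forgotten. This is a sketch under assumptions: `music_prompt` and its parameters are hypothetical names, and the output is plain text for the AI chat rather than a call to any Stella API.

```python
def music_prompt(seconds: int, mood: str, instruments: str, tempo: str = "") -> str:
    """Assemble a music prompt in formula order; it always ends with 'no vocals'."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Duration, mood, and instruments/genre form the opening phrase.
    head = " ".join(
        p for p in (f"{seconds} seconds", mood.strip(), instruments.strip()) if p
    )
    parts = [head]
    if tempo.strip():
        parts.append(tempo.strip())
    parts.append("no vocals")  # appended unconditionally
    return ", ".join(parts)


print(music_prompt(30, "modern", "electronic", "medium tempo"))
```

Because "no vocals" is appended unconditionally, every prompt built this way follows the rule even when the tempo is omitted.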
The Integrated Workflow
Here's how generation works within a real project:
Parallel Asset Generation
All assets generate simultaneously, so there is no waiting.
- Describe the video: "Create a 30-second launch video for wireless earbuds"
- Stella identifies missing assets: Product hero, lifestyle shot, feature icons, background music, voiceover
- Everything generates in parallel: While you refine your script, images and music generate simultaneously
- Assets appear in your timeline: Generated content automatically integrates with your project
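To make "in parallel" concrete, here is a minimal sketch of concurrent generation with Python's `asyncio`. It does not describe how Stella works internally: `generate` is a stand-in that sleeps instead of calling a real service, and the request tuples and delays are invented for illustration.

```python
import asyncio


async def generate(kind: str, prompt: str, delay: float) -> dict:
    """Stand-in for one generation request; sleeps instead of calling a service."""
    await asyncio.sleep(delay)
    return {"kind": kind, "prompt": prompt}


async def generate_all(requests: list[tuple[str, str, float]]) -> list[dict]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(generate(k, p, d) for k, p, d in requests))


requests = [
    ("image", "Wireless earbuds product hero, studio lighting", 0.03),
    ("music", "30 seconds upbeat electronic, no vocals", 0.02),
    ("voiceover", "Introducing the future of home audio.", 0.01),
]
assets = asyncio.run(generate_all(requests))
print([a["kind"] for a in assets])
```

With concurrent requests, the total wait is close to the slowest single request rather than the sum of all three, which is the practical benefit of generating assets in parallel.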
Generation vs. upload: When to use each
| Situation | Generate | Upload |
|---|---|---|
| You need product photos but don't have them | ✓ | — |
| You have professional photos from a shoot | — | ✓ |
| You need background music | ✓ | — |
| You have licensed music you want to use | — | ✓ |
| You need lifestyle imagery | ✓ | — |
| Your founder wants to record their own voice | — | ✓ |
Use AI generation for supplementary content and filler. Use real assets for hero moments and authenticity.
Cost comparison
Traditional Approach (per video)
- Stock images: $10–50
- Voiceover artist: $50–200
- Licensed music: $15–50
- Total: $75–300
Stella Approach
- All generation included in subscription
- Total per video: $0 additional
For teams creating 10+ videos per month, this represents $750–3,000+ in monthly savings.
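The arithmetic behind these figures can be checked with a few lines of Python. The ranges are the ones listed above; the function names are illustrative, and the calculation follows the article in treating the Stella cost per video as $0 additional, so it does not subtract the subscription price itself.

```python
# Per-video cost ranges (low, high) in USD, taken from the list above.
TRADITIONAL = {
    "stock images": (10, 50),
    "voiceover artist": (50, 200),
    "licensed music": (15, 50),
}


def per_video_range(costs: dict) -> tuple:
    """Sum the low ends and the high ends of each cost range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


def monthly_savings(videos: int, costs: dict = TRADITIONAL) -> tuple:
    """Scale the per-video range by the number of videos made per month."""
    low, high = per_video_range(costs)
    return videos * low, videos * high


print(per_video_range(TRADITIONAL))  # (75, 300)
print(monthly_savings(10))           # (750, 3000)
```

Swap in your own asset prices and monthly video count to see where your team falls in the range.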
Try it now
Generate your first image, voiceover, or music track in under a minute.
Share your best generations in our Discord. We feature community creations every week.