
How to use the 3D to Image / Image to Video feature?

Updated over a week ago

Our 3D to Image tool generates high-quality images from the 3D scenes you build. The AI preserves your scene's layout and composition while adding visual detail and style, and the resulting images can then be extended into video.


1. Accessing the 3D Editor

You can enter through:


2. Create or Open a Scene

In the scene management panel, you can:

  • Create a new scene: Start from a blank scene or a preset template

  • Open a saved scene: Resume editing previous work

  • Rename/Delete scenes: For easier management


3. Build Your 3D Scene

In the editor, you can:

  • Add models: Select from the asset library

  • Adjust models: Rotate, scale, and move with Gizmo controls

  • Adjust the camera:

    • Left-click drag = Orbit

    • Right-click drag = Pan

    • Scroll = Step zoom

    • Middle-click drag = Smooth zoom

  • Adjust the environment: Change background, lighting, or ground color

This step defines the spatial layout and composition of your final image.
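
For readers curious how the four camera controls differ, here is a minimal Python sketch of a typical orbit camera. It is an illustration only, not the editor's actual implementation: the OrbitCamera class, its field names, and the speed values are all assumptions made for this example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class OrbitCamera:
    """Hypothetical orbit camera: it circles a target point at a set distance."""
    target: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    distance: float = 10.0
    yaw: float = 0.0    # radians, rotation around the vertical (Y) axis
    pitch: float = 0.3  # radians, elevation above the ground plane

    def orbit(self, dx: float, dy: float, speed: float = 0.01) -> None:
        """Left-click drag: rotate around the target without moving it."""
        self.yaw += dx * speed
        # Clamp pitch so the camera never flips over the top of the scene.
        limit = math.pi / 2 - 0.01
        self.pitch = max(-limit, min(limit, self.pitch + dy * speed))

    def pan(self, dx: float, dy: float, speed: float = 0.01) -> None:
        """Right-click drag: slide the target, and the camera along with it."""
        # Camera's right vector, kept parallel to the ground plane.
        right = (math.cos(self.yaw), 0.0, -math.sin(self.yaw))
        scale = speed * self.distance
        self.target[0] -= right[0] * dx * scale
        self.target[2] -= right[2] * dx * scale
        # Simplification: vertical drags move along the world up axis.
        self.target[1] += dy * scale

    def step_zoom(self, notches: int, factor: float = 0.9) -> None:
        """Scroll: change the distance in fixed multiplicative steps."""
        self.distance *= factor ** notches

    def smooth_zoom(self, dy: float, speed: float = 0.005) -> None:
        """Middle-click drag: change the distance continuously with the drag."""
        self.distance *= math.exp(dy * speed)

    def position(self) -> tuple:
        """World-space camera position derived from yaw, pitch, and distance."""
        horizontal = self.distance * math.cos(self.pitch)
        x = self.target[0] + horizontal * math.sin(self.yaw)
        y = self.target[1] + self.distance * math.sin(self.pitch)
        z = self.target[2] + horizontal * math.cos(self.yaw)
        return (x, y, z)
```

In this sketch, orbiting changes only the viewing angle, panning moves the point the camera looks at, and both zoom modes change only the distance. Step zoom jumps by a fixed factor per scroll notch, while smooth zoom scales with how far you drag.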


4. Add Prompts and References

In the right-side generation panel:

  • 3D Reference (auto): A snapshot of your 3D scene is used as a hard constraint for layout and perspective

  • Text Prompt (required): Describe style and details

(e.g. Medieval town square with a central cobblestone path leading to a large twin-towered castle gate, flanked by a vibrant fabric and food market on the left and a blacksmith forge on the right. Barrels, crates, and two wooden carts—one with hay, one with supplies—are scattered in the foreground. Balanced wide-angle front view, buildings and towers symmetrically positioned in the background.

Ultra-detailed textures, cinematic lighting, high dynamic range, enhanced material realism, slightly foggy medieval atmosphere.)

  • Image Prompt (optional): Upload extra reference images for characters, backgrounds, etc. (GPT-Image only)

💡 Tip: You can refer to specific images in your prompt, e.g. "Use image #2 as background." (By default, the 3D Reference is image #1.)

  • Input Fidelity:

    • Expressive: Produces more detailed and visually expressive images, but may deviate from the scene reference.

    • Consistent: Follows the scene reference more strictly, ensuring higher consistency.
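
To make the rules above concrete, here is a small Python sketch that models the generation panel's inputs and the image numbering used in prompts. Everything in it is hypothetical: the GenerationInputs class, its field names, and the "gpt" / "flux" identifiers are invented for illustration and do not describe a real API. It also assumes that "GPT-Image only" means uploads are rejected for any other model.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical identifiers that mirror the options shown in the panel.
FIDELITY_MODES = {"expressive", "consistent"}
MODELS = {"gpt", "flux"}


@dataclass
class GenerationInputs:
    text_prompt: str
    model: str = "gpt"
    input_fidelity: str = "consistent"
    extra_images: List[str] = field(default_factory=list)  # uploaded file names

    def validate(self) -> None:
        """Check the inputs against the rules described in this step."""
        if not self.text_prompt.strip():
            raise ValueError("Text Prompt is required")
        if self.model not in MODELS:
            raise ValueError(f"Unknown model: {self.model}")
        if self.input_fidelity not in FIDELITY_MODES:
            raise ValueError(f"Unknown Input Fidelity: {self.input_fidelity}")
        # Assumption: Image Prompts are rejected for non-GPT models.
        if self.extra_images and self.model != "gpt":
            raise ValueError("Image Prompts are available with GPT-Image only")

    def numbered_images(self) -> Dict[int, str]:
        """Image #1 is the 3D Reference; uploads are numbered from #2."""
        images = {1: "<3D scene snapshot>"}
        for number, name in enumerate(self.extra_images, start=2):
            images[number] = name
        return images


inputs = GenerationInputs(
    text_prompt="Medieval town square. Use image #2 as background.",
    extra_images=["foggy_sky.png"],
)
inputs.validate()
print(inputs.numbered_images())
```

The point of the sketch is the numbering: because the 3D Reference always takes slot #1, the first image you upload is image #2 in your prompt.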


5. Generate Images

Choose a model, then click Generate Image:

  • GPT Model: Higher quality, more expressive, but may deviate from references

  • Flux Model: Faster, more consistent with 3D depth, but requires precise prompts

After generation, results appear in the image list, where you can:

  • Preview in full size

  • Reuse the prompt

  • Delete or save results


6. Generate Videos

You can enter through:

  • Choose a duration (5s / 10s) and an aspect ratio (1:1, 16:9, or 9:16; text to video only)

  • The image will be used as the starting frame for video generation

  • Preview, download, or reuse prompts in the results
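
The video options above can be summarized in a short Python sketch. The VideoOptions class and its field names are hypothetical. The sketch also relies on one reading of "(text to video only)": that an aspect ratio is chosen only when there is no starting image. Treat that as an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional

DURATIONS_SECONDS = (5, 10)
ASPECT_RATIOS = ("1:1", "16:9", "9:16")


@dataclass
class VideoOptions:
    duration_seconds: int
    start_image: Optional[str] = None   # set for image to video
    aspect_ratio: Optional[str] = None  # chosen for text to video only

    def validate(self) -> None:
        """Check the options against the choices listed in this step."""
        if self.duration_seconds not in DURATIONS_SECONDS:
            raise ValueError("Duration must be 5 or 10 seconds")
        if self.start_image is None:
            # Text to video: an aspect ratio has to be picked.
            if self.aspect_ratio not in ASPECT_RATIOS:
                raise ValueError("Aspect ratio must be 1:1, 16:9, or 9:16")
        elif self.aspect_ratio is not None:
            # Assumption: image to video has no aspect ratio choice.
            raise ValueError("Aspect ratio applies to text to video only")


# Image to video: the generated image is the starting frame.
VideoOptions(duration_seconds=5, start_image="town_square.png").validate()

# Text to video: no starting frame, so an aspect ratio is selected.
VideoOptions(duration_seconds=10, aspect_ratio="16:9").validate()
```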


7. Saving & Revisiting

  • The system automatically saves your scene progress

  • You can always return to scene management to continue editing or generate from past scenes
