Vidu: Create Video Generation Task (Text, Image, Frames)

Submit an asynchronous video generation task using Vidu's AI models. The API returns a task id immediately; poll the Query Video Task endpoint until the task status is completed or failed. Vidu supports text-to-video, image-to-video (driven by a first frame), and first/last frame guidance for precise scene control.

Base URL

https://zcbservice.aizfw.cn/kyyReactApiServer

Endpoint

POST /v1/vidu/videos

Authentication

Include your API key as a Bearer token in every request:

Authorization: Bearer YOUR_API_KEY
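
As a minimal sketch, the header above could be assembled in Python as follows. The helper name `build_headers` is hypothetical; the `Content-Type` value is taken from the curl example in this document.

```python
def build_headers(api_key):
    """Return the HTTP headers for an authenticated JSON request."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Bearer token format as described in the Authentication section.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_API_KEY")
```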

Models

Vidu offers two models with different performance characteristics:

Model	Description
`viduq3-pro`	Higher quality output — richer motion, more vivid and cinematic results
`viduq3-turbo`	Faster generation — same capabilities, reduced wait time

Use viduq3-pro when output quality is the priority. Use viduq3-turbo when you need faster turnaround for iteration or previews.

Request Parameters

model

string

required

The model to use for generation. Supported values:

viduq3-pro — higher quality, cinematic output
viduq3-turbo — faster generation

prompt

string

required

A text description of the video content you want to generate. Be specific about subject, action, camera movement, and visual style. Example: "A cat dancing in the rain, cinematic style"

duration

number

Output video length in seconds. Defaults to 5.

Supported range: 1–16 seconds
Default: 5
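
A client may want to check the range before submitting. The following is a small sketch only: `validate_duration` is a hypothetical helper, and whether the API accepts fractional values is an assumption (the parameter is documented simply as a number).

```python
def validate_duration(duration=5):
    """Return duration if it lies in the documented 1-16 second range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, (int, float)):
        raise TypeError("duration must be a number")
    if not 1 <= duration <= 16:
        raise ValueError("duration must be between 1 and 16 seconds")
    return duration
```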

aspect_ratio

string

Output video aspect ratio. Defaults to 16:9.

Value	Description
`16:9`	Landscape (default)
`9:16`	Portrait
`1:1`	Square
`4:3`	Standard screen
`3:4`	Portrait standard

resolution

string

Output resolution. Defaults to 720p.

Value	Description
`540p`	Standard definition
`720p`	HD (default)
`1080p`	Full HD
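
The aspect_ratio and resolution values listed above can be checked client-side before a request is sent. This is a sketch under the assumption that the lists in this document are exhaustive; `output_options` is a hypothetical helper name.

```python
# Values copied from the aspect_ratio and resolution tables in this document.
ASPECT_RATIOS = ("16:9", "9:16", "1:1", "4:3", "3:4")
RESOLUTIONS = ("540p", "720p", "1080p")


def output_options(aspect_ratio="16:9", resolution="720p"):
    """Validate the output settings and return them as request fields."""
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    return {"aspect_ratio": aspect_ratio, "resolution": resolution}
```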

Image-to-Video Parameters

first_image

string

URL of the first-frame image. Enables image-to-video mode.

Used alone: drives image-to-video generation from this starting frame
Used with last_image: enables first/last frame mode — the model transitions from the first frame to the last

Accepted formats: jpeg, png, webp
Value must be a publicly accessible URL.

last_image

string

URL of the last-frame image. Enables first/last frame guidance mode.

Must be used together with first_image — cannot be used alone
The model generates a smooth transition between the two frames

Accepted formats: jpeg, png, webp
Value must be a publicly accessible URL.

last_image cannot be used without first_image. Both fields are required for first/last frame mode.
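
The rule above can be expressed as a short check. In this sketch the function name `detect_mode` and the returned mode labels are hypothetical; the API itself infers the mode from which fields are present.

```python
def detect_mode(first_image=None, last_image=None):
    """Return which generation mode a combination of image fields selects."""
    if last_image and not first_image:
        # Documented constraint: last_image cannot be used alone.
        raise ValueError("last_image cannot be used without first_image")
    if first_image and last_image:
        return "first_last_frame"
    if first_image:
        return "image_to_video"
    return "text_to_video"
```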

Response Fields

id

string

Unique task identifier. Save this value — you’ll use it to poll the Query Video Task endpoint.

object

string

Object type. Always "video".

created

integer

Unix timestamp of when the task was created.

model

string

The model name used for this task.

status

string

Task status at creation time. Always "queued" on successful submission. Possible lifecycle values:

queued — task accepted and waiting in queue
processing — model is actively generating
completed — generation finished; video_url is available
failed — generation failed; see error for details

error

string

Error message. null on successful submission; populated when status is "failed".
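
The lifecycle values above can be modelled as an enumeration so that client code can tell when to stop polling. This is an illustrative sketch; `TaskStatus` and `is_terminal` are hypothetical names, not part of the API.

```python
from enum import Enum


class TaskStatus(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


def is_terminal(status):
    """Return True once a task has finished, successfully or not."""
    # TaskStatus(status) raises ValueError for a value not listed above.
    return TaskStatus(status) in (TaskStatus.COMPLETED, TaskStatus.FAILED)
```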

Code Examples

curl --request POST \
  --url https://zcbservice.aizfw.cn/kyyReactApiServer/v1/vidu/videos \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "viduq3-pro",
    "prompt": "A cat dancing in the rain, cinematic style",
    "duration": 5,
    "aspect_ratio": "16:9",
    "resolution": "1080p"
  }'
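
For readers working in Python, a roughly equivalent request can be built with the standard library. This is a sketch, not an official client: `build_create_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://zcbservice.aizfw.cn/kyyReactApiServer"


def build_create_request(api_key, payload):
    """Build (but do not send) the POST request for a new video task."""
    return urllib.request.Request(
        url=BASE_URL + "/v1/vidu/videos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "viduq3-pro",
    "prompt": "A cat dancing in the rain, cinematic style",
    "duration": 5,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
}
request = build_create_request("YOUR_API_KEY", payload)

# To submit the task (requires network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     task = json.load(response)
```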

Example Response

{
  "id": "video_fd35ee52-2a98-44a6-b930-29a88ce9b8fd",
  "object": "video",
  "created": 1774836724,
  "model": "viduq3-pro",
  "status": "queued",
  "error": null
}
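
A response like the one above could be handled as in the following sketch. `parse_create_response` is a hypothetical helper, and treating a non-null error field as a failure at submission time is an assumption based on the field descriptions in this document.

```python
import json


def parse_create_response(body):
    """Return (task_id, status) from a create-task response body."""
    task = json.loads(body)
    if task.get("error") is not None:
        raise RuntimeError(f"task submission reported an error: {task['error']}")
    return task["id"], task["status"]


example_body = """
{
  "id": "video_fd35ee52-2a98-44a6-b930-29a88ce9b8fd",
  "object": "video",
  "created": 1774836724,
  "model": "viduq3-pro",
  "status": "queued",
  "error": null
}
"""
task_id, status = parse_create_response(example_body)
```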

Usage Modes

Text-to-Video

Send only model, prompt, and optional duration/aspect_ratio/resolution. The model generates content entirely from the text description.

Image-to-Video

Include first_image (without last_image). The model anchors the video to your provided first frame and generates a natural continuation.

First/Last Frame Guidance

Include both first_image and last_image. The model creates a smooth transition between your two provided frames — ideal for controlled scene transitions and specific visual storytelling.

All image URLs must be publicly accessible on the internet. Base64-encoded images are not supported.
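
The three modes differ only in which image fields the request body carries. The sketch below builds a body for each; `build_payload` is a hypothetical helper and the example.com URLs are placeholders. The scheme check only rejects inline data such as base64 strings; it cannot confirm that a URL is actually publicly reachable.

```python
def build_payload(model, prompt, first_image=None, last_image=None, **options):
    """Build a request body for text, image, or first/last frame mode."""
    if last_image is not None and first_image is None:
        raise ValueError("last_image cannot be used without first_image")
    payload = {"model": model, "prompt": prompt, **options}
    for field, url in (("first_image", first_image), ("last_image", last_image)):
        if url is None:
            continue
        # Reject base64 or data: URIs; only http(s) URLs are passed through.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"{field} must be an http(s) URL, not inline data")
        payload[field] = url
    return payload


text_only = build_payload("viduq3-turbo", "A cat dancing in the rain", duration=5)
from_image = build_payload(
    "viduq3-pro", "The cat starts to dance",
    first_image="https://example.com/first.png",
)
transition = build_payload(
    "viduq3-pro", "Day turns to night",
    first_image="https://example.com/first.png",
    last_image="https://example.com/last.png",
)
```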

Next Steps

After receiving the task id, poll the Query Video Task endpoint to check status and retrieve the video_url when generation completes.
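
A polling loop might look like the sketch below. The Query Video Task endpoint is not specified on this page, so the loop takes a caller-supplied `fetch_task` function that returns the task as a dict; that function, the name `wait_for_video`, and the interval and timeout defaults are all assumptions for illustration.

```python
import time


def wait_for_video(task_id, fetch_task, interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_task(task_id) until the task completes, fails, or times out."""
    deadline = clock() + timeout
    while True:
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "completed":
            return task  # video_url is expected to be present at this point
        if status == "failed":
            raise RuntimeError(task.get("error") or "video generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise in tests without real waiting.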
