VEO Video Generation — Create a Task API Reference

Use this endpoint to submit a new video generation task with Google’s VEO model. You can guide generation in one of two mutually exclusive modes: reference image mode (provide up to 3 stylistic reference images) or first/last frame mode (pin the exact start and end frames of the video). The API responds immediately with a task id — you then poll the Query Video Task endpoint until the video is ready.

Video generation is asynchronous. The create endpoint returns a task id with status: "queued". Use the Query Video Task endpoint to check progress and retrieve the final video_url.

Base URL & Endpoint

Base URL:  https://zcbservice.aizfw.cn/kyyReactApiServer
Endpoint:  POST /v1/veo/videos
Full URL:  https://zcbservice.aizfw.cn/kyyReactApiServer/v1/veo/videos

All requests must include your API key as a Bearer token:

Authorization: Bearer YOUR_API_KEY
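As an illustration of the endpoint and header above, here is a minimal Python sketch that builds the authenticated POST request using only the standard library. The helper names build_create_request and create_task and the 30-second timeout are assumptions made for this example; they are not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://zcbservice.aizfw.cn/kyyReactApiServer"
CREATE_PATH = "/v1/veo/videos"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the create-task endpoint."""
    return urllib.request.Request(
        url=BASE_URL + CREATE_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_task(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON body.

    The timeout value is an arbitrary choice for this sketch.
    """
    request = build_create_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without making a network call.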

Supported Models

Model	Resolutions	Reference Images	First/Last Frame	Notes
`veo_3_1_fast`	720p · 1080p · 4K	✅ up to 3	✅	Fast generation
`veo_3_1_pro`	720p · 1080p · 4K	✅ up to 3	✅	Highest quality
`veo_3_1_pro_stable`	720p · 1080p	✅ up to 3	✅	Stable variant
`veo_3_1_fast_stable`	720p · 1080p	❌	✅ only	First/last frame only

veo_3_1_fast_stable does not support reference image mode; it only accepts first/last frame inputs. Additionally, neither veo_3_1_fast_stable nor veo_3_1_pro_stable supports 4K resolution.
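The constraints in the table and note above can be checked on the client before a request is submitted. The following Python sketch is one way to do that; MODEL_CAPABILITIES and check_model are illustrative names, not part of the API, and the resolution is compared case-insensitively because the resolution parameter is documented as case-insensitive.

```python
# Capabilities copied from the Supported Models table.
# Resolutions are stored in lowercase for case-insensitive comparison.
MODEL_CAPABILITIES = {
    "veo_3_1_fast": {"resolutions": {"720p", "1080p", "4k"}, "reference_images": True},
    "veo_3_1_pro": {"resolutions": {"720p", "1080p", "4k"}, "reference_images": True},
    "veo_3_1_pro_stable": {"resolutions": {"720p", "1080p"}, "reference_images": True},
    "veo_3_1_fast_stable": {"resolutions": {"720p", "1080p"}, "reference_images": False},
}


def check_model(model: str, resolution: str = "720p",
                uses_reference_images: bool = False) -> list:
    """Return a list of problems; an empty list means the combination is allowed."""
    caps = MODEL_CAPABILITIES.get(model)
    if caps is None:
        return [f"unknown model: {model}"]
    problems = []
    if resolution.lower() not in caps["resolutions"]:
        problems.append(f"{model} does not support {resolution}")
    if uses_reference_images and not caps["reference_images"]:
        problems.append(f"{model} does not support reference image mode")
    return problems
```

For example, check_model("veo_3_1_pro_stable", "4K") reports a problem, while check_model("veo_3_1_pro", "4K") returns an empty list.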

Request Parameters

model

string

required

The VEO model variant to use. See the models table above for capability details.
Accepted values: veo_3_1_fast, veo_3_1_pro, veo_3_1_pro_stable, veo_3_1_fast_stable

prompt

string

required

A text description of the video you want to generate.
Example: "A cat dancing in the rain, cinematic style"

resolution

string

Output video resolution. Defaults to 720p.

720p — supported by all models
1080p — supported by all models
4K — only veo_3_1_fast and veo_3_1_pro

The value is case-insensitive (720p, 720P, 4K, and 4k are all accepted).

aspect_ratio

string

Output video aspect ratio. Defaults to 16:9.

16:9 — landscape
9:16 — portrait

Specify resolution separately using the resolution field. Do not append resolution suffixes (e.g. -1080p) to this field.

Reference Image Mode

Provide up to 3 reference images to guide the overall style and content of the generated video. Mutually exclusive with first/last frame mode.

input_reference

array

Array of reference image URLs. Maximum 3 images.
Supported models: veo_3_1_fast, veo_3_1_pro, veo_3_1_pro_stable.
Use either this field or image_urls, not both. Cannot be combined with first_image / last_image.

image_urls

array

Array of reference image URLs. Maximum 3 images.
Supported models: veo_3_1_fast, veo_3_1_pro, veo_3_1_pro_stable.
Use either this field or input_reference, not both. Cannot be combined with first_image / last_image.
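A request body for reference image mode could be assembled as in the Python sketch below, which applies the limits described in this section (1 to 3 URLs, one of the two field names, a model that supports the mode). The function build_reference_payload and its field argument are hypothetical helpers for illustration only.

```python
MAX_REFERENCE_IMAGES = 3
# Models listed in this section as supporting reference images.
REFERENCE_MODELS = {"veo_3_1_fast", "veo_3_1_pro", "veo_3_1_pro_stable"}


def build_reference_payload(model: str, prompt: str, image_urls: list,
                            field: str = "input_reference") -> dict:
    """Build a reference-image request body, validating the documented limits."""
    if field not in ("input_reference", "image_urls"):
        raise ValueError("field must be 'input_reference' or 'image_urls'")
    if model not in REFERENCE_MODELS:
        raise ValueError(f"{model} does not support reference image mode")
    if not 1 <= len(image_urls) <= MAX_REFERENCE_IMAGES:
        raise ValueError("provide between 1 and 3 reference image URLs")
    # Only one of the two equivalent fields is ever written to the body.
    return {"model": model, "prompt": prompt, field: list(image_urls)}
```

Because the helper writes exactly one of input_reference or image_urls, a body built this way cannot contain both fields.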

First / Last Frame Mode

Pin the exact starting frame of the generated video, or both the starting and ending frames. Mutually exclusive with reference image mode. All four models support this mode.

first_image

string

URL of the image to use as the first frame of the video.

Used alone: image-to-video mode (video starts from this frame)
Used with last_image: first/last frame mode (video starts and ends at the specified frames)

Cannot be combined with input_reference / image_urls.

last_image

string

URL of the image to use as the last frame of the video. Must be used together with first_image.
Cannot be combined with input_reference / image_urls.
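The two frame parameters can be combined as in the following Python sketch. Making first_image a required argument and last_image optional mirrors the rule that last_image must be used together with first_image. The name build_frame_payload is an assumption for this example.

```python
from typing import Optional


def build_frame_payload(model: str, prompt: str, first_image: str,
                        last_image: Optional[str] = None) -> dict:
    """Build a request body for image-to-video or first/last frame mode."""
    if not first_image:
        raise ValueError("first_image is required in this mode")
    payload = {"model": model, "prompt": prompt, "first_image": first_image}
    if last_image is not None:
        # With both frames set, the video starts and ends at the given images.
        payload["last_image"] = last_image
    return payload
```

Calling the helper with only first_image yields an image-to-video request; passing both URLs yields a first/last frame request.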

Response Fields

id

string

Unique identifier for the video generation task. Save this value — you will use it to poll the Query Video Task endpoint.

object

string

Object type. Always "video".

created

integer

Unix timestamp (seconds) of when the task was created.

model

string

The model name you specified in the request.

status

string

Initial task status. On successful creation this is always "queued".

error

string

Error message. Only present when status is "failed".
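The response fields above map naturally onto a small typed record. The Python sketch below shows one possible shape; the VideoTask class, the parse_task function, and the created_at convenience property are illustrative and not defined by the API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class VideoTask:
    id: str
    object: str
    created: int
    model: str
    status: str
    error: Optional[str] = None  # only present when status is "failed"

    @property
    def created_at(self) -> datetime:
        """The created timestamp (Unix seconds) as a UTC datetime."""
        return datetime.fromtimestamp(self.created, tz=timezone.utc)


def parse_task(body: dict) -> VideoTask:
    """Convert a decoded JSON response body into a VideoTask."""
    return VideoTask(
        id=body["id"],
        object=body["object"],
        created=int(body["created"]),
        model=body["model"],
        status=body["status"],
        error=body.get("error"),
    )
```

Using body.get for error keeps parsing safe for successful responses, where that field is absent.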

Code Examples

curl --request POST \
  --url https://zcbservice.aizfw.cn/kyyReactApiServer/v1/veo/videos \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "veo_3_1_fast",
    "prompt": "A cat dancing in the rain, cinematic style",
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "input_reference": [
      "https://example.com/reference1.jpg",
      "https://example.com/reference2.jpg"
    ]
  }'

Example success response:

{
  "id": "vgen_abc123def456",
  "object": "video",
  "created": 1761635478,
  "model": "veo_3_1_fast",
  "status": "queued"
}

Guidance Mode Comparison

Mode	When to use	Supported models
Reference image mode	You want the video to match an overall visual style	`veo_3_1_pro_stable`, `veo_3_1_fast`, `veo_3_1_pro`
First/last frame mode	You need precise control over the opening and closing frames	All four models

The two guidance modes are mutually exclusive. You cannot combine reference images (input_reference / image_urls) with first/last frame parameters (first_image / last_image) in the same request.
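The exclusivity rule can be enforced before a request is sent. The Python sketch below inspects a request body and reports which guidance mode it uses, raising an error on conflicting fields. The function detect_guidance_mode and the returned labels are hypothetical; the "none" label assumes that a request without any image fields is accepted, which this page does not state explicitly.

```python
REFERENCE_FIELDS = ("input_reference", "image_urls")
FRAME_FIELDS = ("first_image", "last_image")


def detect_guidance_mode(payload: dict) -> str:
    """Return "reference", "frames", or "none"; raise ValueError on conflicts."""
    reference = [name for name in REFERENCE_FIELDS if payload.get(name)]
    frames = [name for name in FRAME_FIELDS if payload.get(name)]
    if reference and frames:
        raise ValueError("reference images cannot be combined with first_image / last_image")
    if len(reference) == 2:
        raise ValueError("use input_reference or image_urls, not both")
    if "last_image" in frames and "first_image" not in frames:
        raise ValueError("last_image must be used together with first_image")
    if reference:
        return "reference"
    if frames:
        return "frames"
    # Assumption: a body with no image fields is a plain text-to-video request.
    return "none"
```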

After creating a task, poll the Query Video Task endpoint every 30–60 seconds. Reference image mode typically completes in 2–5 minutes; first/last frame mode typically takes 3–5 minutes.
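The polling advice above might be implemented along the lines of this Python sketch. The Query Video Task endpoint is documented separately, so the sketch takes a caller-supplied fetch_task function instead of assuming a URL. The success status name "completed" and the 15-minute timeout are assumptions; only "queued" and "failed" are named on this page, so check the Query Video Task reference for the actual values.

```python
import time
from typing import Callable


def wait_for_task(task_id: str,
                  fetch_task: Callable[[str], dict],
                  interval_seconds: float = 30.0,
                  timeout_seconds: float = 900.0,
                  done_statuses: tuple = ("completed",),  # assumed status name
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_task(task_id) until the task finishes, fails, or times out."""
    waited = 0.0
    while True:
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "failed":
            raise RuntimeError(task.get("error", "video generation failed"))
        if status in done_statuses:
            return task
        if waited >= timeout_seconds:
            raise TimeoutError(f"task {task_id} still {status!r} after {waited:.0f}s")
        sleep(interval_seconds)
        waited += interval_seconds
```

Injecting fetch_task and sleep keeps the loop independent of any particular HTTP client and lets it be exercised without real delays.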
