Use this endpoint to submit a video generation task with Seedance 2.0, ByteDance’s latest-generation video model. Seedance 2.0 supports multi-modal reference inputs: you can guide generation with reference images, reference videos, or reference audio clips, or pin the first and last frames. Two model tiers are available (standard and Pro), each with a fast variant.
Video generation is asynchronous. The create endpoint responds immediately with a task id and status: "queued". Poll the Query Video Task endpoint with that id to check progress and retrieve the final video_url.
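The submit-then-poll flow can be sketched as a small loop. This is a minimal sketch, not the official client: the Query Video Task endpoint is not specified on this page, so the request itself is left to a caller-supplied `fetch_task` function (hypothetical), and the sketch assumes the query response is a JSON object carrying `status`, `error`, and, once ready, `video_url`.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    task_id: str,
    fetch_task: Callable[[str], Dict[str, Any]],  # hypothetical: calls Query Video Task
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task exposes a video_url, fails, or the timeout elapses."""
    waited = 0.0
    while True:
        task = fetch_task(task_id)
        if task.get("status") == "failed":
            # Assumption: the query response reports failures like the create response.
            raise RuntimeError(task.get("error") or "video generation failed")
        if task.get("video_url"):
            return task["video_url"]
        if waited >= timeout_s:
            raise TimeoutError(f"task {task_id} not ready after {timeout_s} seconds")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. The interval and timeout defaults are illustrative values, not documented limits.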

Base URL & Endpoint

Base URL:  https://zcbservice.aizfw.cn/kyyReactApiServer
Endpoint:  POST /v1/kyyvideo2/videos
Full URL:  https://zcbservice.aizfw.cn/kyyReactApiServer/v1/kyyvideo2/videos
All requests must include your API key as a Bearer token:
Authorization: Bearer YOUR_API_KEY
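As an illustration of the URL and header above, the following sketch builds the authenticated POST request with Python's standard library. The helper name `build_create_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://zcbservice.aizfw.cn/kyyReactApiServer"
ENDPOINT = "/v1/kyyvideo2/videos"


def build_create_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the create endpoint."""
    return urllib.request.Request(
        url=BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # API key as a Bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response.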

Supported Models

| Model | Reference Video Required | Speed | Quality |
| --- | --- | --- | --- |
| seedance_2_0 | ✅ Required | Standard | Standard |
| seedance_2_0_fast | ✅ Required | Fast | Standard |
| seedance_2_0_pro | ⭕ Optional | Standard | High |
| seedance_2_0_fast_pro | ⭕ Optional | Fast | High |
seedance_2_0 and seedance_2_0_fast require at least one reference video (referenceVideos). seedance_2_0_pro and seedance_2_0_fast_pro work without reference videos, making them suitable for pure text-to-video and first/last frame workflows.
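The reference-video rule can be checked on the client before submitting. The sketch below encodes the table above as a lookup; the function name `model_problems` and the error wording are illustrative, and the API may report these conditions differently.

```python
from typing import List

# True means the model requires at least one entry in referenceVideos.
REQUIRES_REFERENCE_VIDEO = {
    "seedance_2_0": True,
    "seedance_2_0_fast": True,
    "seedance_2_0_pro": False,
    "seedance_2_0_fast_pro": False,
}


def model_problems(payload: dict) -> List[str]:
    """Return a list of problems with the model / referenceVideos combination."""
    model = payload.get("model")
    if model not in REQUIRES_REFERENCE_VIDEO:
        return [f"unknown model: {model!r}"]
    if REQUIRES_REFERENCE_VIDEO[model] and not payload.get("referenceVideos"):
        return [f"{model} requires at least one entry in referenceVideos"]
    return []
```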

Request Parameters

model
string
required
The Seedance 2.0 model to use. Accepted values: seedance_2_0, seedance_2_0_fast, seedance_2_0_pro, seedance_2_0_fast_pro.
prompt
string
required
A text description of the video you want to generate. Be specific about scene, style, motion, and lighting. Example: "A cat dancing in the rain, cinematic style"
duration
integer
Output video duration in seconds. Range: 4–15. Default: 5. Longer durations increase generation time proportionally.
aspect_ratio
string
Output video aspect ratio. Default: 16:9. Accepted values: 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive.
generateAudio
boolean
Whether to include synthesised audio in the output video. Default: true.
  • true — output video includes synchronised audio
  • false — output video is silent
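The optional output parameters can be validated locally before a request is sent. The following is a minimal sketch that assumes the duration range is 4 to 15 seconds; the helper names are hypothetical. The server applies the defaults itself, so `apply_defaults` only mirrors them for the checks.

```python
from typing import List

ASPECT_RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}
DEFAULTS = {"duration": 5, "aspect_ratio": "16:9", "generateAudio": True}


def apply_defaults(payload: dict) -> dict:
    """Return a copy of the payload with the documented defaults filled in."""
    return {**DEFAULTS, **payload}


def option_problems(payload: dict) -> List[str]:
    """Check duration, aspect_ratio, and generateAudio against the documented values."""
    p = apply_defaults(payload)
    problems = []
    d = p["duration"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(d, bool) or not isinstance(d, int) or not 4 <= d <= 15:
        problems.append("duration must be an integer from 4 to 15")  # assumed range
    if p["aspect_ratio"] not in ASPECT_RATIOS:
        problems.append("aspect_ratio is not an accepted value")
    if not isinstance(p["generateAudio"], bool):
        problems.append("generateAudio must be true or false")
    return problems
```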

First / Last Frame Mode

Pin the first frame, or both the first and last frames, of the generated video. Mutually exclusive with referenceImages, referenceVideos, and referenceAudios.
first_image
string
The first frame of the video. Accepts a public URL or an asset library reference in the format asset://ASSET_ID.
  • Used alone: image-to-video mode (video starts from this frame)
  • Used with last_image: first/last frame mode
Image requirements:
  • Formats: JPEG, PNG, WebP, BMP, TIFF, GIF
  • Aspect ratio (width ÷ height): 0.4–2.5
  • Dimensions: 300–6000 px per side
Cannot be combined with referenceImages, referenceVideos, or referenceAudios.
last_image
string
The last frame of the video. Must be used together with first_image. Same format and dimension requirements as first_image. Cannot be combined with referenceImages, referenceVideos, or referenceAudios.
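The frame rules above can be expressed as two small helpers: one checks the image size constraints, the other assembles the frame fields and rejects last_image without first_image. This is a sketch under the stated requirements; both function names are hypothetical, and the image width and height are assumed to be known by the caller.

```python
from typing import Dict, Optional


def image_meets_requirements(width_px: int, height_px: int) -> bool:
    """Check the 300-6000 px side limits and the 0.4-2.5 width/height ratio."""
    sides_ok = 300 <= width_px <= 6000 and 300 <= height_px <= 6000
    return sides_ok and 0.4 <= width_px / height_px <= 2.5


def frame_fields(
    first_image: Optional[str] = None,
    last_image: Optional[str] = None,
) -> Dict[str, str]:
    """Build the first/last frame fields; last_image requires first_image."""
    if last_image and not first_image:
        raise ValueError("last_image must be used together with first_image")
    fields: Dict[str, str] = {}
    if first_image:
        fields["first_image"] = first_image
    if last_image:
        fields["last_image"] = last_image
    return fields
```

For example, a 1920×1080 image passes, while a 300×900 image fails because its ratio (about 0.33) is below 0.4.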

Multi-Modal Reference Mode

Provide reference media to guide generation style and content. Mutually exclusive with first_image and last_image.
referenceImages
array
A list of reference image URLs or asset library references (asset://ASSET_ID) to guide visual style.
  • Count: 1–9 images
  • Formats: JPEG, PNG, WebP, BMP, TIFF, GIF
  • Aspect ratio: 0.4–2.5
  • Dimensions: 300–6000 px per side
Cannot be combined with first_image or last_image.
referenceVideos
array
A list of reference video URLs or asset library references (asset://ASSET_ID) to guide motion and content.
  • Count: max 3 videos
  • Resolutions: 480p or 720p
  • Duration per clip: 2–15 seconds
  • Total combined duration: ≤ 15 seconds
The system automatically selects the optimal backend model based on your reference videos. Cannot be combined with first_image or last_image.
referenceAudios
array
A list of reference audio URLs or asset library references (asset://ASSET_ID) to guide the generated audio track.
  • Count: max 3 clips
  • Formats: WAV, MP3
  • Duration per clip: 2–15 seconds
  • Total combined duration: ≤ 15 seconds
  • File size per clip: ≤ 15 MB
Cannot be combined with first_image or last_image.
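The clip limits shared by referenceVideos and referenceAudios (at most 3 clips, 2 to 15 seconds each, 15 seconds combined) can be checked once the clip durations are known. The sketch below assumes the caller has already measured each clip's duration; `clip_problems` is a hypothetical helper, and it does not cover resolution, format, or file size.

```python
from typing import List, Sequence


def clip_problems(durations_s: Sequence[float], max_clips: int = 3) -> List[str]:
    """Check clip count, per-clip duration, and combined duration."""
    problems = []
    if len(durations_s) > max_clips:
        problems.append(f"at most {max_clips} clips are allowed")
    for index, duration in enumerate(durations_s):
        if not 2 <= duration <= 15:
            problems.append(f"clip {index} must be 2 to 15 seconds long")
    if sum(durations_s) > 15:
        problems.append("combined duration must not exceed 15 seconds")
    return problems
```

Note that the combined cap is the tighter constraint in practice: two 8-second clips each pass the per-clip check but fail together.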

Input Mode Mutual Exclusivity

The two input guidance modes are mutually exclusive. You cannot use both in the same request:
  • First/last frame mode: first_image, last_image
  • Multi-modal reference mode: referenceImages, referenceVideos, referenceAudios
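A client can detect a mixed-mode request before sending it. The following sketch classifies a payload by which fields are set; the returned mode labels are illustrative names, not values the API uses.

```python
FRAME_FIELDS = ("first_image", "last_image")
REFERENCE_FIELDS = ("referenceImages", "referenceVideos", "referenceAudios")


def input_mode(payload: dict) -> str:
    """Return the guidance mode in use, or raise if both modes are mixed."""
    uses_frames = any(payload.get(name) for name in FRAME_FIELDS)
    uses_references = any(payload.get(name) for name in REFERENCE_FIELDS)
    if uses_frames and uses_references:
        raise ValueError(
            "first_image/last_image cannot be combined with "
            "referenceImages, referenceVideos, or referenceAudios"
        )
    if uses_frames:
        return "first_last_frame"
    if uses_references:
        return "multi_modal_reference"
    return "text_only"
```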

Response Fields

id
string
Unique identifier for the video generation task. Save this value — you will use it to poll the Query Video Task endpoint.
object
string
Object type. Always "video".
created
integer
Unix timestamp (seconds) of when the task was created.
model
string
The model name you specified in the request.
status
string
Initial task status. On successful creation this is always "queued".
error
string
Error message. Populated only when status is "failed"; otherwise null.

Code Examples

curl --request POST \
  --url https://zcbservice.aizfw.cn/kyyReactApiServer/v1/kyyvideo2/videos \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "seedance_2_0_pro",
    "prompt": "A cat dancing in the rain, cinematic style",
    "duration": 8,
    "aspect_ratio": "16:9",
    "generateAudio": true
  }'
Example success response:
{
  "id": "video_fd35ee52-2a98-44a6-b930-29a88ce9b8fd",
  "object": "video",
  "created": 1774836724,
  "model": "seedance_2_0_pro",
  "status": "queued",
  "error": null
}
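The response body can be parsed into a typed object so that the id is easy to keep for polling. This is a minimal sketch built from the response fields listed above; the `VideoTask` class and `parse_task` function are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoTask:
    id: str
    object: str
    created: int          # Unix timestamp in seconds
    model: str
    status: str
    error: Optional[str] = None


def parse_task(body: str) -> VideoTask:
    """Parse the JSON body returned by the create endpoint."""
    data = json.loads(body)
    return VideoTask(
        id=data["id"],
        object=data["object"],
        created=data["created"],
        model=data["model"],
        status=data["status"],
        error=data.get("error"),  # null in JSON becomes None
    )
```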

Tips

Use seedance_2_0_pro or seedance_2_0_fast_pro for text-to-video and first/last-frame tasks. These models do not require a reference video, and first/last frame mode cannot be combined with referenceVideos.
If you are using reference images that contain human faces or virtual avatars, you must upload them to the asset library first for content review. Use asset://ASSET_ID references rather than direct URLs in those cases.
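To help apply this rule, a client can list which reference entries are still direct URLs rather than asset library references, so they can be reviewed for faces or avatars before submission. This is only a sketch: the helper names are hypothetical, and deciding whether an image actually contains a face is outside its scope.

```python
from typing import List

ASSET_PREFIX = "asset://"


def is_asset_ref(ref: str) -> bool:
    """True for asset library references of the form asset://ASSET_ID."""
    return ref.startswith(ASSET_PREFIX) and len(ref) > len(ASSET_PREFIX)


def direct_url_refs(refs: List[str]) -> List[str]:
    """Return the entries that are not asset library references."""
    return [ref for ref in refs if not is_asset_ref(ref)]
```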
Keep your reference videos short (2–5 seconds each) and under the 15-second total cap. The system selects the optimal backend model automatically based on your reference video content.