Video Model Support Matrix
This page summarizes the primary entry points, common modes, and reference image passing methods for the current video model families.
Model Summary Table
| Model Family | Representative Models | Documentation Page | Recommended Entry Point | Common Modes | Reference Image Passing |
|---|---|---|---|---|---|
| Sora | sora-2 | Overview | /v1/videos | Text-to-video, first-frame-to-video | JSON input_reference |
| Veo | veo_3_1, veo_3_1-fast | Overview | /v1/videos | Text-to-video, first and last frames, reference-to-video | JSON input_reference |
| Grok Video | grok-video-3, grok-video-3-pro, grok-video-3-max | Overview | /v1/videos | Text-to-video, first-frame-to-video, first and last frames, reference-to-video | multipart input_reference |
| Domestic Video Models (AIGC) | Vidu-*, Kling-*, jimeng-video-*, GV-*, OS-*, Hunyuan-*, Mingmou-*, Hailuo-*, SV-*, JV-* | Overview | /v1/videos | Text-to-video, image-to-video, reference images, reference videos, first and last frames, motion control, digital human, lip sync, template effects | JSON image / images / input_reference / metadata |
| Seedance-2 | doubao-seedance-2-0-260128, doubao-seedance-2-0-fast-260128 | Overview | /v1/video/generations; assets: /v1/seedance/asset/* | Text-to-video, first and last frames, multimodal reference, asset library (asset://) | JSON content + metadata |
Family-by-Family Notes
Sora
| Item | Description |
|---|---|
| Model example | sora-2 |
| Recommended entry point | POST /v1/videos |
| Documentation page | Sora Video Overview |
| Common fields | prompt, size, seconds, input_reference, metadata |
| Reference image | JSON input_reference; the gateway also still accepts multipart uploads |
| Aspect ratio | Commonly 16:9, 9:16 |
| Duration | Submitted via seconds; the specific available values depend on the current upstream and channel configuration |
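
As a minimal sketch of how the fields above might be combined into a JSON body for POST /v1/videos, the helper below keeps only the fields the caller actually sets. The helper name `build_sora_body`, the inclusion of a `model` field, and the example values (a `size` of `"1280x720"`, a `seconds` of `8`) are assumptions for illustration; check the Sora Video Overview for the value formats your channel accepts.

```python
import json


def build_sora_body(prompt, size=None, seconds=None, input_reference=None, metadata=None):
    """Assemble a request body from the common Sora fields, skipping unset ones."""
    body = {"model": "sora-2", "prompt": prompt}
    optional = {
        "size": size,
        "seconds": seconds,
        "input_reference": input_reference,
        "metadata": metadata,
    }
    # Only send fields the caller provided, so upstream defaults still apply.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


# Example values are illustrative, not documented defaults.
body = build_sora_body("a paper boat drifting down a rainy street", size="1280x720", seconds=8)
print(json.dumps(body, indent=2))
```

Leaving `seconds` unset defers the duration to the upstream and channel configuration, which matches the note that available values are not fixed.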
Veo
| Item | Description |
|---|---|
| Model examples | veo_3_1, veo_3_1-fast |
| Recommended entry point | POST /v1/videos |
| Documentation page | Veo Video Overview |
| Common modes | Text-to-video, first and last frames, reference-to-video |
| Reference image | JSON input_reference, converted by the server into the Veo request structure |
| Notes | In reference-to-video mode, the request defaults to landscape orientation to avoid upstream incompatibility |
Domestic Video Models (AIGC)
| Item | Description |
|---|---|
| Model examples | Vidu-*, Kling-*, jimeng-video-*, GV-*, OS-*, Hunyuan-*, Mingmou-*, Hailuo-*, SV-*, JV-* |
| Recommended entry point | POST /v1/videos |
| Documentation page | Domestic Video Model Overview |
| Common fields | model, prompt, seconds, duration, size, image, images, metadata |
| Reference image | Supports image, images, input_reference; for advanced scenarios, you can also pass it via metadata.file_infos |
| Typical scenarios | Text-to-video, image-to-video, reference images, reference videos, first and last frames, motion control, digital human, lip sync, template effects |
Grok Video
| Item | Description |
|---|---|
| Model examples | grok-video-3, grok-video-3-pro, grok-video-3-max |
| Recommended entry point | POST /v1/videos |
| Documentation page | Grok Video Overview |
| Common fields | prompt, seconds, aspect_ratio, size |
| Reference image | Sent via multipart input_reference, supports multiple images |
| Duration rules | grok-video-3-pro is fixed at 10s; grok-video-3-max is fixed at 15s |
| Special mode | Also supports a combined mode of “first-frame-to-video + reference image” |
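
The duration rules above can be enforced on the client before a request is sent. The sketch below is one way to do that; the function name `resolve_seconds` is hypothetical, and treating a mismatched value as an error (rather than silently overriding it) is a design assumption, not documented gateway behavior.

```python
# Fixed durations from the table above; models not listed take the caller's value.
FIXED_SECONDS = {"grok-video-3-pro": 10, "grok-video-3-max": 15}


def resolve_seconds(model, requested=None):
    """Return the duration to send for a Grok Video model."""
    fixed = FIXED_SECONDS.get(model)
    if fixed is None:
        return requested
    if requested is not None and requested != fixed:
        raise ValueError(f"{model} is fixed at {fixed}s, got {requested}s")
    return fixed
```

For example, `resolve_seconds("grok-video-3-pro")` returns `10`, while `resolve_seconds("grok-video-3", 6)` passes the requested value through unchanged.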
Seedance-2
| Item | Description |
|---|---|
| Model examples | doubao-seedance-2-0-260128, doubao-seedance-2-0-fast-260128 |
| Recommended entry point | POST /v1/video/generations; assets: POST /v1/seedance/asset/* |
| Documentation page | Seedance-2 Overview |
| Common fields | content[] (text / image_url / video_url / audio_url + role), metadata.duration, metadata.ratio, metadata.resolution |
| Asset reference | Use asset://{assetId} after upload |
| Query | GET /v1/video/generations/{task_id} |
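
To illustrate how content[] and metadata fit together, here is a hedged sketch of a request body for POST /v1/video/generations. The table only names the item types and the role attribute, so the exact item shape used here (`{"type": "image_url", "image_url": {"url": ...}, "role": ...}`), the role value `"first_frame"`, and the asset ID `abc123` are all assumptions; confirm them against the Seedance-2 Overview.

```python
def asset_uri(asset_id):
    """Build the asset:// reference used after an asset has been uploaded."""
    return f"asset://{asset_id}"


def build_seedance2_body(model, prompt, images=(), duration=None, ratio=None, resolution=None):
    """Assemble content[] plus metadata; `images` is a sequence of (url, role) pairs."""
    content = [{"type": "text", "text": prompt}]
    for url, role in images:
        # Assumed item shape: the source lists image_url + role but not the nesting.
        content.append({"type": "image_url", "image_url": {"url": url}, "role": role})
    metadata = {
        key: value
        for key, value in {"duration": duration, "ratio": ratio, "resolution": resolution}.items()
        if value is not None
    }
    body = {"model": model, "content": content}
    if metadata:
        body["metadata"] = metadata
    return body


# Hypothetical asset ID and role value, for illustration only.
body = build_seedance2_body(
    "doubao-seedance-2-0-260128",
    "the camera slowly pulls back from the subject",
    images=[(asset_uri("abc123"), "first_frame")],
    duration=5,
)
```

The task ID returned by the submit call would then be polled with GET /v1/video/generations/{task_id}.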
Doubao Seedance
| Item | Description |
|---|---|
| Model examples | doubao-seedance-1-5-pro_480p, doubao-seedance-1-5-pro_720p, doubao-seedance-1-5-pro_1080p |
| Recommended entry point | POST /v1/videos |
| Documentation page | Domestic Video Model Overview |
| Common fields | prompt, seconds, size |
| Reference image | multipart first_frame_image, last_frame_image |
| Duration rules | Duration is currently limited to between 4 and 11 seconds |
| Notes | Not suitable for “reference-to-video” mode |
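
A client-side check for the duration limit might look like the sketch below. It assumes the 4 to 11 second range is inclusive at both ends, which the table does not state explicitly; the function name is hypothetical.

```python
MIN_SECONDS = 4
MAX_SECONDS = 11


def check_seedance_seconds(seconds):
    """Reject durations outside the currently documented range (assumed inclusive)."""
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(
            f"seconds must be between {MIN_SECONDS} and {MAX_SECONDS}, got {seconds}"
        )
    return seconds
```

Because the limit is described as current, keeping the bounds in named constants makes them easy to update if the upstream changes.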
Alibaba wan2.6
| Item | Description |
|---|---|
| Model examples | wan2.6-t2v:1280*720, wan2.6-t2v:1920*1080, wan2.6-i2v:1280*720, wan2.6-i2v:1920*1080 |
| Recommended entry point | POST /v1/videos |
| Documentation page | Domestic Video Model Overview |
| Common modes | t2v text-to-video, i2v image-to-video |
| Resolution | The model name already includes a fixed resolution tier |
| Reference image | i2v is commonly a single-image input |
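
Since the model name already encodes the mode and the resolution tier, a client can derive both from the name instead of sending them separately. The parser below is a sketch based only on the four example names above; `WanModel` and `parse_wan_model` are hypothetical names.

```python
from typing import NamedTuple


class WanModel(NamedTuple):
    mode: str    # "t2v" (text-to-video) or "i2v" (image-to-video)
    width: int
    height: int


def parse_wan_model(name):
    """Split a name such as 'wan2.6-i2v:1280*720' into mode and resolution."""
    base, _, resolution = name.partition(":")
    prefix = "wan2.6-"
    if not base.startswith(prefix) or not resolution:
        raise ValueError(f"not a wan2.6 model name: {name!r}")
    mode = base[len(prefix):]
    if mode not in ("t2v", "i2v"):
        raise ValueError(f"unknown wan2.6 mode: {mode!r}")
    width, height = (int(part) for part in resolution.split("*"))
    return WanModel(mode, width, height)
```

A caller can then require an input image only when `parse_wan_model(name).mode == "i2v"`.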
Vidu
| Item | Description |
|---|---|
| Model examples | Vidu-q3-pro, Vidu-q3-turbo |
| Recommended entry point | POST /v1/videos |
| Documentation page | Domestic Video Model Overview |
| Request style | JSON |
| First-frame image | image |
| First and last frames | image + metadata.last_frame_url |
| Reference-to-video | images, commonly up to 3 images |
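
The three image modes above differ only in which fields carry the images, so they can be sketched as one builder. The function name, the choice to treat the modes as mutually exclusive, and the hard cap of 3 reference images (the table says "commonly up to 3") are assumptions; image values are shown as URLs purely for illustration.

```python
def build_vidu_body(model, prompt, first_frame=None, last_frame=None, references=(), max_references=3):
    """Place images in the field that matches the intended Vidu mode."""
    body = {"model": model, "prompt": prompt}
    if references:
        if first_frame or last_frame:
            raise ValueError("reference-to-video cannot be combined with frame images here")
        if len(references) > max_references:
            raise ValueError(f"at most {max_references} reference images are expected")
        body["images"] = list(references)          # reference-to-video
    elif first_frame:
        body["image"] = first_frame                # first-frame-to-video
        if last_frame:
            body["metadata"] = {"last_frame_url": last_frame}  # first and last frames
    elif last_frame:
        raise ValueError("a last frame requires a first frame")
    return body
```

With no images at all, the body contains only `model` and `prompt`, which corresponds to plain text-to-video.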
Kling
| Item | Description |
|---|---|
| Model examples | Kling-3.0, Kling-3.0-Omni |
| Recommended entry point | POST /v1/videos or the official compatible route /kling/v1/videos/* |
| Documentation page | Domestic Video Model Overview |
| Request style | JSON |
| Common fields | prompt, seconds, metadata.output_config |
| Reference image | image |
| Audio | Can be controlled via metadata.output_config.audio_generation |
Jimeng Video
| Item | Description |
|---|---|
| Model examples | jimeng-video-3.0, jimeng-video-2.0 |
| Recommended entry point | POST /v1/videos (OpenAI format), POST /v1/video/create (unified video) |
| Documentation page | Domestic Video Model Overview, Jimeng Video Overview |
| Request style | JSON / multipart/form-data |
| Common fields | model, prompt, seconds, size, input_reference (OpenAI format); images, aspect_ratio, size (unified video) |
| Reference image | OpenAI format: input_reference file upload; unified video: images array |
| Integration modes | OpenAI format, unified video, Doubao channel |
| Typical scenarios | Text-to-video, image-to-video, first and last frames to video |
Hailuo
| Item | Description |
|---|---|
| Model examples | Hailuo-2.3, Hailuo-2.3-fast |
| Recommended entry point | POST /v1/videos |
| Documentation page | Domestic Video Model Overview |
| Request style | JSON |
| Common fields | prompt, seconds, metadata.output_config.resolution |
| Reference image | image |
| Notes | Do not rely on aspect_ratio; the model is currently best suited to text-to-video and first-frame-to-video |
Supported Generation Modes
| Model Family | Text-to-video | First-frame-to-video | First and last frames | Reference-to-video | Audio toggle |
|---|---|---|---|---|---|
| Sora | Supported | Supported | Some scenarios depend on upstream | Some scenarios are implemented through multi-image reference | Supported |
| Veo | Supported | Can be implemented via reference image | Supported | Supported | Depends on upstream |
| Grok Video | Supported | Supported | Supported | Supported | Depends on upstream |
| Doubao Seedance | Supported | Supported | Supported | Not recommended | Depends on upstream |
| Alibaba wan2.6 | Supported | i2v supported | Depends on upstream | Depends on upstream | Depends on upstream |
| Jimeng Video | Supported | Supported | Supported | Supported | Depends on upstream |
| Vidu | Supported | Supported | Supported | Supported | Depends on upstream |
| Kling | Supported | Supported | Not currently guaranteed as a standard capability | Not recommended | Supported |
| Hailuo | Supported | Supported | Not recommended | Not recommended | Depends on upstream |
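
For routing logic, the matrix above can be encoded as data. The sketch below records only the cells that read exactly "Supported"; qualified cells (for example "i2v supported" or "Depends on upstream") are deliberately left out, which is a conservative assumption. The mode keys and the function name are hypothetical.

```python
# Modes per family whose matrix cell reads exactly "Supported".
FULLY_SUPPORTED = {
    "Sora": {"text", "first_frame", "audio"},
    "Veo": {"text", "first_last", "reference"},
    "Grok Video": {"text", "first_frame", "first_last", "reference"},
    "Doubao Seedance": {"text", "first_frame", "first_last"},
    "Alibaba wan2.6": {"text"},
    "Jimeng Video": {"text", "first_frame", "first_last", "reference"},
    "Vidu": {"text", "first_frame", "first_last", "reference"},
    "Kling": {"text", "first_frame", "audio"},
    "Hailuo": {"text", "first_frame"},
}


def families_supporting(mode):
    """List the families that unconditionally support a mode, sorted by name."""
    return sorted(family for family, modes in FULLY_SUPPORTED.items() if mode in modes)


print(families_supporting("reference"))
```

A family missing from a result is not necessarily unsupported; it may simply depend on the upstream, so treat this as a shortlist rather than a hard filter.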
Recommended Reading
- Grok Video Overview
- Sora Video Overview
- Veo Video Overview
- If you want to implement image-to-video, first confirm whether the target model expects image, images, input_reference, or first_frame_image / last_frame_image.
- If you are integrating with Kling’s official format, use the /kling/v1/videos/* routes.
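
One way to make the image-field check concrete is a small lookup keyed by model family, compiled from the per-family tables on this page. The names `REFERENCE_FIELDS` and `reference_fields` are hypothetical, and families whose field name is not stated on this page (such as Alibaba wan2.6) are intentionally omitted.

```python
# (request style, image field names) per family, as listed in the tables on this page.
REFERENCE_FIELDS = {
    "Sora": ("json", ("input_reference",)),
    "Veo": ("json", ("input_reference",)),
    "Grok Video": ("multipart", ("input_reference",)),
    "Doubao Seedance": ("multipart", ("first_frame_image", "last_frame_image")),
    "Vidu": ("json", ("image", "images")),
    "Kling": ("json", ("image",)),
    "Hailuo": ("json", ("image",)),
}


def reference_fields(family):
    """Return (request style, field names) for a family, or raise if it is not listed."""
    try:
        return REFERENCE_FIELDS[family]
    except KeyError:
        raise KeyError(f"no reference image fields recorded for {family!r}") from None
```

For instance, `reference_fields("Grok Video")` returns `("multipart", ("input_reference",))`, signalling that the images should be sent as form-data parts rather than in a JSON body.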