MCP API Integration Guide
How to create videos with AI agents like Claude Desktop, Claude Code, and Cursor via MCP
What is MCP?
MCP (Model Context Protocol) is a standard protocol that allows AI agents (Claude, GPT, Cursor, etc.) to directly call tools from external services. By connecting to the Autoagens MCP server, you can create videos through conversation with AI.
Getting Started
- Sign up and generate an API Key from Dashboard > Settings > API Keys.
- Copy the configuration below for your AI client and paste it into the settings file.
- Tell your AI “Create a video for me” and you're done!
Connection Info
| Item | Value |
|---|---|
| MCP Endpoint | https://autoagens.com/api/mcp |
| Transport | Streamable HTTP |
| Auth | Authorization: Bearer ak_YOUR_API_KEY |
Connect with Claude Desktop
Open the Claude Desktop config file and add the following:
- Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "autoagens-video": {
      "type": "url",
      "url": "https://autoagens.com/api/mcp",
      "headers": {
        "Authorization": "Bearer ak_YOUR_API_KEY"
      }
    }
  }
}
Replace ak_YOUR_API_KEY with your actual API key. Save and restart Claude Desktop to connect.
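If the config file already lists other MCP servers, the entry can be merged in programmatically instead of pasted by hand. The following is a minimal sketch, assuming the file is plain JSON with the `mcpServers` layout shown above. The helper name `add_autoagens_server` and the `other-server` entry are hypothetical and not part of any Autoagens or Claude tooling.

```python
import copy
import json


# Hypothetical helper: merges the Autoagens entry into an existing config dict
# and leaves any servers that are already configured untouched.
def add_autoagens_server(config: dict, api_key: str) -> dict:
    merged = copy.deepcopy(config)  # do not mutate the caller's dict
    servers = merged.setdefault("mcpServers", {})
    servers["autoagens-video"] = {
        "type": "url",
        "url": "https://autoagens.com/api/mcp",
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
    return merged


# Example input: a config that already contains an unrelated (made-up) server.
existing = {"mcpServers": {"other-server": {"command": "example"}}}
updated = add_autoagens_server(existing, "ak_YOUR_API_KEY")
print(json.dumps(updated, indent=2))
```

In practice the dict would be loaded with `json.load` from the path listed above and written back with `json.dump` before restarting Claude Desktop.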
Connect with Claude Code (CLI)
Run this command in your terminal:
claude mcp add autoagens-video \
  --transport http \
  https://autoagens.com/api/mcp \
  --header "Authorization: Bearer ak_YOUR_API_KEY"
Connect with Cursor
Create a .cursor/mcp.json file in your project root:
{
  "mcpServers": {
    "autoagens-video": {
      "type": "url",
      "url": "https://autoagens.com/api/mcp",
      "headers": {
        "Authorization": "Bearer ak_YOUR_API_KEY"
      }
    }
  }
}
Windsurf / Other MCP Clients
Any client supporting MCP Streamable HTTP can connect:
URL: https://autoagens.com/api/mcp
Method: POST
Header: Authorization: Bearer ak_YOUR_API_KEY
Header: Content-Type: application/json
Header: Accept: application/json, text/event-stream
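For a client without built-in MCP support, the settings above amount to an ordinary HTTP POST carrying a JSON-RPC 2.0 message. The sketch below only builds (does not send) an `initialize` request, which is the first message an MCP client sends. The function name, the `clientInfo` values, and the protocol version string `2025-03-26` are assumptions for illustration; the server may negotiate a different version.

```python
import json
import urllib.request

MCP_ENDPOINT = "https://autoagens.com/api/mcp"


def build_initialize_request(api_key: str) -> urllib.request.Request:
    # MCP messages are JSON-RPC 2.0 objects.
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: accepted by the server
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},  # placeholder
        },
    }
    # Headers exactly as listed in the connection settings above.
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    return urllib.request.Request(
        MCP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_initialize_request("ak_YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending it with `urllib.request.urlopen(request)` requires a real key. Because the `Accept` header allows both types, the response may arrive as plain JSON or as a server-sent event stream.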
Verify Connection
After connecting, try telling your AI:
- “Read the video-guide prompt” — to see the full usage guide
- “List available tools” — to see all 11 tools
Available Tools
Workflow Management
| Tool | Description |
|---|---|
| create_workflow | Create a new video workflow |
| save_workflow | Save nodes and edges to a workflow |
| list_workflows | List your workflows |
Assets & Images
| Tool | Description | Cost |
|---|---|---|
| upload_asset | Upload an external image URL or base64 data as a stable asset. Reuse the returned assetId across tools | Free |
| generate_image | AI image generation (text- or image-based). Automatically saved as an asset with an assetId | 100 coins |
Video Generation
| Tool | Description | Cost |
|---|---|---|
| create_clip | Create a video clip (async). A prompt is required for ALL modes. Supports video continuation for scene continuity | 2,500-7,500 coins |
| compose_video | Compose clips into the final video (async) | 500 coins |
Status & Frame Extraction
| Tool | Description |
|---|---|
| get_job_status | Check clip/compose job status |
| extract_frame | Extract a frame from a completed clip. Used for multi-scene chaining (synchronous, takes 5-15s) |
Templates
| Tool | Description |
|---|---|
| list_templates | List public templates |
| use_template | Copy a template to your workflows |
Supported AI Video Models
| Model | Modes | Duration | Features |
|---|---|---|---|
| veo3-fast | T2V | 5-8s | Google Veo 3 fast, seed, negative_prompt |
| veo3 | T2V | 5-8s | Google Veo 3, highest quality |
| kling-v3-pro | I2V, IT2V | 3-15s | Character refs, endImage, negative_prompt, video continuity |
| kling-o3 | I2V, IT2V | 3-15s | High quality, character refs, endImage, negative_prompt, video continuity |
| kling-v2.6-pro | I2V, IT2V | 5 or 10s | endImage, negative_prompt |
| ltx-2.3 | I2V, IT2V | 6-20s (even values only) | Fast generation, endImage |
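The duration limits in the table can be checked on the client before spending coins on a clip. This is a rough sketch that transcribes the Duration column, assuming durations are whole seconds; the names `DURATION_RULES` and `is_valid_duration` are hypothetical, and the server remains the authority on what it accepts.

```python
# Allowed durations in seconds, transcribed from the model table above.
DURATION_RULES = {
    "veo3-fast": range(5, 9),        # 5-8s
    "veo3": range(5, 9),             # 5-8s
    "kling-v3-pro": range(3, 16),    # 3-15s
    "kling-o3": range(3, 16),        # 3-15s
    "kling-v2.6-pro": (5, 10),       # 5 or 10s
    "ltx-2.3": range(6, 21, 2),      # 6-20s, even values only
}


def is_valid_duration(model: str, seconds: int) -> bool:
    allowed = DURATION_RULES.get(model)
    if allowed is None:
        raise ValueError(f"unknown model: {model}")
    return seconds in allowed


print(is_valid_duration("kling-v2.6-pro", 10))  # True
print(is_valid_duration("ltx-2.3", 7))          # False: odd value
```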
Advanced Options (model-specific)
| Feature | Description | Supported Models |
|---|---|---|
| negativePrompt | Specify unwanted elements | veo3, kling models |
| seed | Reproduce identical results | veo3-fast, veo3 |
| characterReferences | Character consistency (frontal + reference images) | kling-v3-pro, kling-o3 |
| endImageUrl | Specify end frame image | kling, ltx models |
Video Continuity Options
For multi-scene production, reference the previous scene's video to enhance motion, camera, and scene layout continuity.
| Parameter | Description |
|---|---|
| videoUrl / referenceVideoUrl | Previous clip video URL for motion/camera continuity reference |
| previousClipTaskId | taskId of a completed previous clip. The server auto-resolves its videoUrl |
| continueMotion | Continue motion state from previous clip (default true with reference video) |
| preserveCamera | Preserve camera direction/movement (default true with reference video) |
| preserveSceneLayout | Preserve scene layout/blocking (default true with reference video) |
The backend currently processes continuity via frame extraction + prompt augmentation. Korean prompts are automatically translated to English before being sent to the model.
Video Creation Flow
Just tell your AI “Create a 30-second ad video” and it handles everything automatically. Under the hood:
- Create workflow: Start with create_workflow
- Prepare reference images: Use upload_asset to import external images as stable assets (free)
- Generate images (optional): Create source images with generate_image (100 coins)
- Create clips: Call create_clip for each scene. A prompt is required for all modes (async, 1-3 min)
- Check status: Poll get_job_status every 3-5 seconds
- Compose video: Combine all clips with compose_video (2-5 min)
- Download result: Get the final video URL
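The status-polling step in the flow above can be sketched as a small loop. The shape of the `get_job_status` response is not documented here, so the `status` field and its `completed` / `failed` values are assumptions, as is the helper name `wait_for_job`. The status lookup is passed in as a callable so the loop stays independent of any particular MCP client.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], dict],
    interval_s: float = 4.0,       # within the suggested 3-5 second range
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job finishes, fails, or the timeout elapses."""
    waited = 0.0
    while True:
        job = get_status()
        state = job.get("status")  # assumed field name
        if state == "completed":   # assumed value
            return job
        if state == "failed":      # assumed value
            raise RuntimeError(f"job failed: {job}")
        if waited >= timeout_s:
            raise TimeoutError(f"job still running after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Demo with canned responses instead of a real get_job_status call.
fake_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "videoUrl": "https://example.com/clip.mp4"},
])
result = wait_for_job(lambda: next(fake_responses), sleep=lambda _s: None)
print(result["videoUrl"])
```

Injecting `sleep` keeps the loop testable without real delays; a real caller would leave the default in place.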
Asset-based Workflow
The assetId returned by upload_asset or generate_image is a stable reference to an image stored on Autoagens servers. Unlike external URLs that may expire, an assetId is always valid.
Multi-Scene Chaining
Passing just a single frame creates visual continuity but may lack motion/camera/story continuity. For best results, pass both the end frame and the previous video reference.
- Scene 1 create_clip → wait for completion
- extract_frame(taskId: "scene1_taskId", position: "end") → get frameUrl
- get_job_status(jobId: "scene1_taskId") → get previous scene videoUrl
- Scene 2 create_clip with:
  - imageUrl: frameUrl (visual continuity)
  - referenceVideoUrl: "scene1_videoUrl" (motion/camera continuity)
  - prompt: "next scene description" (required)
- Repeat for all scenes, then compose_video
Or use previousClipTaskId for automatic videoUrl resolution:
create_clip({
  imageUrl: "frameUrl",
  previousClipTaskId: "scene1_taskId",
  prompt: "next scene description",
  model: "kling-v3-pro",
  duration: 5,
  clipId: "scene-02"
})
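The chaining procedure can be expressed as a loop over scene prompts. The sketch below is illustrative only: `call_tool` and `wait_for_clip` are hypothetical callables supplied by whatever MCP client is in use, and the assumption that `create_clip` returns a `taskId` field and `extract_frame` a `frameUrl` field is inferred from the steps above rather than from a documented response schema. The final compose_video call is omitted because its arguments are not listed in this guide.

```python
from typing import Callable, List


def chain_scenes(
    call_tool: Callable[[str, dict], dict],   # hypothetical: invokes an MCP tool
    wait_for_clip: Callable[[str], None],     # hypothetical: blocks until the clip is done
    first_image_url: str,
    prompts: List[str],
    model: str = "kling-v3-pro",
    duration: int = 5,
) -> List[str]:
    task_ids: List[str] = []
    image_url = first_image_url
    previous_task_id = None
    for index, prompt in enumerate(prompts, start=1):
        args = {
            "imageUrl": image_url,
            "prompt": prompt,                  # required for every clip
            "model": model,
            "duration": duration,
            "clipId": f"scene-{index:02d}",
        }
        if previous_task_id is not None:
            # Lets the server resolve the previous clip's videoUrl itself.
            args["previousClipTaskId"] = previous_task_id
        task_id = call_tool("create_clip", args)["taskId"]  # assumed response field
        wait_for_clip(task_id)
        task_ids.append(task_id)
        if index < len(prompts):
            # The end frame of this clip becomes the start image of the next one.
            frame = call_tool("extract_frame", {"taskId": task_id, "position": "end"})
            image_url = frame["frameUrl"]                   # assumed response field
        previous_task_id = task_id
    return task_ids


# Demo with a fake tool caller that records calls instead of contacting a server.
calls = []


def fake_call_tool(name: str, args: dict) -> dict:
    calls.append((name, args))
    if name == "create_clip":
        return {"taskId": f"task-{args['clipId']}"}
    return {"frameUrl": f"https://example.com/{args['taskId']}-end.png"}


ids = chain_scenes(
    fake_call_tool,
    lambda task_id: None,
    "https://example.com/start.png",
    ["scene one", "scene two"],
)
print(ids)
```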
Pricing
| Action | Cost |
|---|---|
| Upload asset | Free |
| Generate image | 100 coins |
| Extract frame | Free |
| Create clip (5s) | 2,500 coins |
| Create clip (10s) | 5,000 coins |
| Create clip (15s) | 7,500 coins |
| Compose video | 500 coins |
Failed jobs are automatically refunded. Admin users can use all tools for free.
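To budget a project before starting, the price table can be turned into a small estimator. This sketch covers only the three clip durations priced above and assumes a single compose step per video; prices for other durations are not listed, so the function rejects them rather than guessing. The name `estimate_coins` is hypothetical.

```python
# Prices in coins, taken from the pricing table above.
CLIP_COST = {5: 2500, 10: 5000, 15: 7500}
IMAGE_COST = 100
COMPOSE_COST = 500


def estimate_coins(clip_durations, generated_images: int = 0) -> int:
    # One compose step is assumed; uploads and frame extraction are free.
    total = generated_images * IMAGE_COST + COMPOSE_COST
    for seconds in clip_durations:
        if seconds not in CLIP_COST:
            raise ValueError(f"no listed price for a {seconds}s clip")
        total += CLIP_COST[seconds]
    return total


# A 30-second video built from six 5-second clips and two generated images.
print(estimate_coins([5] * 6, generated_images=2))  # 15700
```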
FAQ
Where do I get an API Key?
Generate one from Dashboard > Settings > API Keys. Keys start with ak_.
Which AI clients are supported?
Any client that supports MCP Streamable HTTP. Tested with Claude Desktop, Claude Code, Cursor, and Windsurf.
What is the difference between upload_asset and generate_image?
upload_asset stores an existing image on Autoagens (free). generate_image creates a new AI image and auto-saves it (100 coins). Both return an assetId.
What is extract_frame for?
It extracts a frame from a completed video clip. In multi-scene chaining, it is used to pass the last frame of the previous scene as the start image for the next scene.
How do I create smooth scene transitions?
Use extract_frame to get the end frame, then pass both imageUrl (the frame) and referenceVideoUrl or previousClipTaskId (the video) to the next create_clip. This provides both visual and motion/camera continuity.
Can I use Korean prompts?
Yes, both Korean and English are supported. Korean prompts are automatically translated to English before being sent to the model.
Where can I buy coins?
Visit Dashboard > Coins to purchase coins.