Canvas Mode: Visual Workflow Editor
Canvas Mode is a node-based visual workflow editor that lets you build multi-step AI generation pipelines by connecting nodes with wires. Instead of chatting with an agent, you directly design the data flow — from inputs (text, images, audio, video) through AI generation nodes — to produce your final output.
Keywords: Visual, Pipeline, Node Graph, Multi-Step
Canvas Mode is ideal when you want fine-grained control over each generation step, need to chain multiple AI models together, or want to reuse the same workflow repeatedly with different inputs.
Getting Started
1. Open Canvas Mode
Navigate to /canvas in your browser. You'll see the Canvas list page with a mode toggle at the top.
Click "New Canvas" to create a blank canvas and enter the editor.
2. The Editor Interface
The editor is a full-screen node graph with the following controls:
- Top-left toolbar — Canvas title (click to edit) and save status indicator
- Bottom-left controls — Zoom in/out, fit view, minimap
- Double-click on canvas — Open the node menu to add new nodes
- Share button — Copy a read-only share link for the current canvas
Node Types
Canvas provides 8 node types organized into two categories: Input nodes (provide data) and Generation nodes (process data with AI).
Input Nodes
- Text Input — Enter free-form text to use as prompts or instructions
- Image Upload — Upload an image file as input for downstream nodes
- Audio Upload — Upload an audio file (e.g. for lip-sync or audio-driven video)
- Video Upload — Upload a video file as reference or source material
Generation Nodes
- Image Generation — Generate images using AI models (Nano Banana 2, Nano Banana Pro, Seedream, etc.)
- Video Generation — Generate videos using AI models (Kling 3, Veo 3, Sora 2, Seedance, etc.)
- Text Generation — Generate text using an LLM (e.g. expand a short idea into a detailed prompt)
- Image Angles — Adjust the viewing angle of an image (horizontal, vertical, zoom)
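The two node categories above can be pictured with a small data model. This is an illustrative sketch only; the editor's internal representation is not documented, so the class shape and field names here are assumptions.

```python
from dataclasses import dataclass, field

# The 8 node types from the guide, split into the two categories.
INPUT_NODES = {"text_input", "image_upload", "audio_upload", "video_upload"}
GENERATION_NODES = {"image_generation", "video_generation",
                    "text_generation", "image_angles"}

@dataclass
class Node:
    id: str
    kind: str                                   # one of the 8 node types above
    params: dict = field(default_factory=dict)  # e.g. model, aspect_ratio, mode

    @property
    def is_input(self) -> bool:
        """Input nodes provide data; generation nodes process it with AI."""
        return self.kind in INPUT_NODES
```

A Text Input node would then be `Node("n1", "text_input", {"text": "..."})`, while a Video Generation node carries its model choice and parameters in `params`.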
Building Your First Workflow
Let's walk through a simple example: generating a video from a text prompt.
Step 1: Add a Text Input Node
Double-click on the canvas and select Text Input. Type your prompt in the text area, for example:
A golden retriever running on a sunny beach, slow motion, cinematic lighting
Step 2: Add a Video Generation Node
Double-click on the canvas again and select Video Generation. A new node appears with model selection and parameters.
Choose your preferred model (e.g. Kling 3) and configure:
- Aspect Ratio — 16:9, 9:16, or 1:1
- Duration — 3s to 15s depending on the model
- Mode — Standard or Pro (Pro costs more credits but produces higher quality)
Step 3: Connect the Nodes
Drag from the output handle (right side) of the Text Input node to the input handle (left side) of the Video Generation node. A wire appears connecting them.
The text you entered will automatically flow into the video node's prompt field.
Step 4: Run the Node
Click the Run button on the Video Generation node. The node enters a loading state and polls for task completion (every 3 seconds).
Once complete, the generated video appears directly in the node. You can preview it inline or download it.
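The run-and-poll behavior described above can be sketched as a simple loop. This is a conceptual illustration, not the product's actual client code: `submit` and `get_status` stand in for a backend API that this guide does not document, and the status shape is an assumption.

```python
import time

def run_node(submit, get_status, poll_interval=3.0, timeout=120.0, sleep=time.sleep):
    """Submit a generation task, then poll for completion every
    `poll_interval` seconds (3s in Canvas Mode) until it finishes
    or `timeout` seconds have elapsed."""
    task_id = submit()
    waited = 0.0
    while waited < timeout:
        status = get_status(task_id)
        if status["state"] == "completed":
            return status["result"]   # e.g. the generated video's URL
        if status["state"] == "failed":
            raise RuntimeError(status.get("error", "generation failed"))
        sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError("generation did not finish in time")
```

The injectable `sleep` parameter is just a convenience so the loop can be exercised without real delays.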
Advanced Workflows
Chaining Multiple Nodes
You can chain nodes to build multi-step pipelines. Here are some common patterns:
Text → Image → Video (Image-to-Video)
- Text Input — Write an image description
- Image Generation — Generate a still image from the text
- Video Generation — Animate the generated image into a video
Text → Text → Image (Prompt Expansion)
- Text Input — Write a short idea ("a cyberpunk city")
- Text Generation — Expand into a detailed prompt using LLM
- Image Generation — Generate an image from the expanded prompt
Image + Text → Video (Controlled Video Generation)
- Image Upload — Provide a reference frame
- Text Input — Describe the desired motion
- Video Generation — Both inputs flow into the video node: the image as the start frame, the text as the motion prompt
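The chained patterns above all amount to evaluating a small directed graph: a node runs once every node wired into it has produced its output. A minimal sketch, where the node and edge shapes are assumptions rather than the editor's internal format:

```python
def run_pipeline(nodes, edges, run):
    """Evaluate a node graph. `nodes` maps id -> node dict, `edges` is a
    list of (src_id, dst_id, field) wires, and `run(node, inputs)` stands
    in for one generation step. Assumes the graph is acyclic."""
    outputs = {}
    pending = dict(nodes)
    while pending:
        for nid, node in list(pending.items()):
            incoming = [(src, fld) for src, dst, fld in edges if dst == nid]
            if all(src in outputs for src, _ in incoming):
                # All upstream results are ready: collect them by field name.
                inputs = {fld: outputs[src] for src, fld in incoming}
                outputs[nid] = run(node, inputs)
                del pending[nid]
    return outputs
```

For the Text → Image → Video pattern, the edges would be `[("text", "image", "prompt"), ("image", "video", "image_urls")]`, and the text node's output flows through each stage in turn.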
End Frame Support
Some video models (Kling 3, Veo 3) support end frame — you can connect a second image to define where the video should end. This gives you precise control over the motion trajectory.
Data Flow Rules
Connections follow a color-coded type system:
- Text (Blue) — Flows into `prompt` fields
- Image (Green) — Flows into `image_urls` or `end_image_url` fields
- Video (Orange) — Flows into `video_url` fields
- Audio (Purple) — Flows into `audio_url` fields
Upstream node outputs automatically propagate to downstream node inputs when connected.
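The type rules above can be expressed as a small lookup. The field names come from the list above; the function itself is an illustration of the validation the editor performs, not its actual code.

```python
# Which input fields each output type may feed, per the color-coded rules.
TYPE_TO_FIELDS = {
    "text":  {"prompt"},
    "image": {"image_urls", "end_image_url"},
    "video": {"video_url"},
    "audio": {"audio_url"},
}

def can_connect(output_type: str, input_field: str) -> bool:
    """Return True if a wire from an output of `output_type` may attach
    to `input_field` on a downstream node."""
    return input_field in TYPE_TO_FIELDS.get(output_type, set())
```

This is why dragging a text output onto a `video_url` handle simply fails to create a wire: the types are incompatible.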
Available Models
Image Generation Models
- Nano Banana 2 — Fast, versatile image generation with extended aspect ratios and up to 4K resolution
- Nano Banana Pro — Higher quality variant with the same flexibility
- Seedream 4.5 / 5 Lite — High-quality image generation
Video Generation Models
- Kling 3 — Versatile video generation, 3–15s duration, supports end frame and multi-shot
- Veo 3 Fast — Google's video model, supports end frame
- Sora 2 — OpenAI's video model, 10–15s duration, standard and pro modes
- Grok — xAI's video model
Note: Credit costs vary by model. The cost is displayed on the node before you run it.
Auto-Save and Sharing
- Auto-save — Your canvas is automatically saved 2 seconds after any change (node moves, parameter edits, new connections). The "Saving..." indicator appears in the toolbar.
- Share — Click the Share button in the toolbar to copy a read-only link (`/canvas_view/{id}`) that anyone can view without editing.
- Canvas list — Return to `/canvas` to see all your saved canvases, sorted by last update time.
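The 2-second auto-save behaves like a trailing debounce: each change resets a timer, and the canvas is persisted only once changes stop. A minimal sketch, with `save_fn` standing in for the real persistence call (the actual implementation is not documented):

```python
import threading

class AutoSaver:
    """Debounced auto-save: persist the canvas `delay` seconds after the
    last change (2s in Canvas Mode)."""
    def __init__(self, save_fn, delay=2.0):
        self.save_fn = save_fn
        self.delay = delay
        self._timer = None

    def on_change(self, canvas_state):
        if self._timer is not None:
            self._timer.cancel()  # a newer change supersedes the pending save
        self._timer = threading.Timer(self.delay, self.save_fn,
                                      args=(canvas_state,))
        self._timer.start()
```

Debouncing is what keeps a rapid burst of node drags from firing a save per pixel: only the final state after the burst is written.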
Tips
- Double-click to add nodes — The fastest way to add nodes is double-clicking anywhere on the canvas.
- Drag to connect — Drag from any output handle to an input handle to create a connection. Invalid connections (incompatible types) will not be created.
- Delete nodes — Select a node and press `Delete` or `Backspace` to remove it.
- Delete connections — Click on a wire to select it, then press `Delete` to remove the connection.
- Pan and zoom — Scroll to zoom, drag the background to pan. Use the minimap in the bottom-left for orientation.
- Reuse workflows — Since canvases are saved automatically, you can revisit a canvas, change the input text or images, and re-run the pipeline to generate new variations.
