Arcads external API
Configuration
- Base URL:
https://external-api.arcads.ai
(or ).
- Auth: HTTP Basic. Use the API key as the username and an empty password unless Arcads documentation for your key specifies otherwise. Example curl:
curl -u "$ARCADS_API_KEY:" "$ARCADS_BASE_URL/v1/products"
- Never print API keys, commit , or paste keys into .
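The Basic auth rule above (key as username, empty password) can be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper names `basic_auth_header` and `load_api_key` are hypothetical, and the environment variable name is taken from the curl example.

```python
import base64
import os


def basic_auth_header(api_key: str) -> str:
    """Build an HTTP Basic Authorization value: key as username, empty password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def load_api_key(env=os.environ) -> str:
    """Read the key from the environment and fail without echoing any secret."""
    key = env.get("ARCADS_API_KEY", "").strip()
    if not key:
        # The message names the variable only, never a key value.
        raise RuntimeError("ARCADS_API_KEY is not set")
    return key
```

The header value would go into an `Authorization` request header; it should never be logged or printed.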
If the key is missing or the API returns 401/403
- Editor-first (default): Ensure exists (copy from in the repo root). Ask the user to paste only inside and save. Do not ask them to paste the key in chat unless they insist.
- Chat-assisted: If they paste the key in chat, write for them, confirm "saved to " without repeating the key, and remind them that chat history may retain secrets—rotate the key in Arcads if the chat could be shared.
Before the first call, confirm
excludes
.
Read order
- Repo root when present (brand voice, decisions, quirks).
- This skill's reference.md for routes, bodies, polling.
- prompting/guide.md then the right
prompting/prompt-library/
file for the model (see table below).
Decision tree: which flow?
All video models use
with the appropriate
value (see
reference.md for the full
schema).
| User goal | Start here | Prompt library |
|---|---|---|
| Seedance 2.0 UGC video — selfie-style product review / testimonial | with | seedance-2.md (platform guide) + seedance-2-ugc.md (9-layer UGC formula) |
| Seedance 2.0 premium product reveal — dark-void, no person, text narrative | with | seedance-2.md + seedance-2-premium-reveal.md |
| Seedance 2.0 product hero — elemental effects, no person, splash/mist | with | seedance-2.md + seedance-2-product-hero.md |
| Seedance 2.0 studio lookbook — polished, voiceover, multi-look | with | seedance-2.md + seedance-2-studio-lookbook.md |
| Seedance 2.0 feature walkthrough — fast-paced feature demo | with | seedance-2.md + seedance-2-feature-walkthrough.md |
| Reverse-engineer a video style into a reusable Seedance 2.0 template | Follow the analyze-video skill | prompting/analyze-video/SKILL.md |
| Clone/replicate an existing video ad for a different product | Follow the clone-ad skill | prompting/clone-ad/SKILL.md |
| Raw Sora 2 video from text (plus product) | with | prompt-library/sora-2.md |
| Sora remix of an existing asset | POST /v1/sora2/remix/video | sora-2.md |
| Veo 3.1 video | with | prompt-library/veo-3-1.md |
| Kling 3.0 video | with | kling-3.md |
| Grok Video | with | See reference.md for fields |
| Nano Banana still image (standalone or as starting frame for video) | with by default; optional (Nano Banana Pro) | nano-banana.md |
| B-roll clip (product-level) | | kling-3.md or nano-banana.md for craft; see reference.md for Kling/Nano routing notes |
| Scene generation | | Same as b-roll row |
| Recreate an influencer from a reference photo | Two-step: (1) with to generate a still image via Nano Banana, get user approval; (2) upload approved still → with and for video. Never skip the approval step. | prompt-library/influencer-recreation.md |
| Product showcase — AI person holds/uses a product and talks about it | Two-step: (1) with product ; (2) user approves still; (3) start-frame → video via . | prompt-library/product-showcase.md |
| UGC / selfie-style (authentic reels, cross-model) | Any video model via | prompt-library/ugc-selfie-style.md — cross-model UGC guide. For Seedance 2.0 specifically, use seedance-2-ugc.md instead. |
| Create a new AI influencer from text (character sheet) | Two-pass: (1) hero portrait via , get approval; (2) 9 angles with hero as . Save to . | prompt-library/character-sheet.md |
| UGC product selfie — AI influencer holding a product | Combine character hero + product photo + style references as . | prompt-library/ugc-product-selfie.md |
| Pixar-style 3D animated ad — anthropomorphized cartoon ad with mascot beats | Multi-step: (1) Lock cast sheet; (2) ChatGPT Image 2 storyboard stills via with (max 5 ); (3) Seedance 2.0 image-to-video per beat via with and from each still; (4) ffmpeg-stitch + burn captions. | ../../shared/skills/pixar-style-ad/prompting/guide.md → storyboard-gpt-image-2.md + animate-seedance-2.md |
| Claymation / Aardman-style ad — sculpted plasticine characters, narrator-driven 8-beat story arc, 60–115s | Multi-step: (1) Lock cast sheet (protagonist + supporting character + narrator voice); (2) ChatGPT Image 2 storyboard stills via with (max 5 ) — fallback to (Pro) for close-ups if clay texture flattens; (3) Seedance 2.0 image-to-video per beat via with ; (4) ffmpeg-stitch (optional for stop-motion judder) + burn captions. | ../../shared/skills/claymation-ad/prompting/guide.md → storyboard-gpt-image-2.md + animate-seedance-2.md |
| Add captions to a finished video: burn timed narrator/dialogue captions onto an existing MP4 (any source: claymation, pixar, UGC, B-roll) | Out of band (no Arcads API call). Multi-step: (1) npx hyperframes init <run-id>-captions; (2) npx hyperframes transcribe source.mp4 --model medium.en (NOT if there's background music); (3) group word-level transcript into reading phrases; (4) write captions-only HTML over magenta bg; never include or elements (causes black-bar bug); (5) then ffmpeg chromakey=0xff00ff:0.10:0.05 overlay onto source. | ../../shared/skills/caption-video/prompting/guide.md |
| Talking avatar / script (actors, voices) | , POST /v1/scripts/{id}/generate | prompting/guide.md |
| OmniHuman | | prompting/guide.md |
| Audio-driven | | prompting/guide.md |
Prefer the shortest path: if the user only needs a single model, do not create scripts unless they ask for actors/lip-sync workflows.
Creative layer
- MANDATORY: Before composing any prompt for the API, read the relevant prompting/prompt-library/*.md file for the chosen model/workflow. Do NOT skip this step: every prompt must align with the vendor guide's formula and best practices.
- Build one clear prompt paragraph; avoid keyword soup.
- For Seedance 2.0 / Sora 2 / Veo 3.1 / Kling / Grok Video / Nano Banana, align with the official vendor guides linked in each prompting/prompt-library/*.md file (do not paste full vendor docs into chat; summarize the checks instead).
- Merge slot values from the user and from when it conflicts with defaults.
Session setup: auto-create a dated folder
At the start of each session that will generate assets, create a folder and project for the day so everything is organized in the Arcads dashboard:
- Get today's date as .
- → pick the target product (default to whichever specifies under "My workspace"). If no default is set: if only one product exists, auto-populate with its ID and name; if multiple, ask the user to pick and save their choice to .
- Check existing folders (GET /v1/products/{productId}/folders). If "Arcads API - {today}" already exists, reuse it. Otherwise:
- with
{"productId": "...", "name": "Arcads API - YYYY-MM-DD"}
.
- with
{"productId": "...", "folderId": "...", "name": "Arcads API - YYYY-MM-DD"}
.
- Store the for the session and pass it in every generation call ( field on Sora2/Veo31/b-roll/scene/image DTOs) and use
POST /v1/assets/add-to-project
after generation for asset types that do not accept directly.
This ensures every generated asset is findable in the Arcads dashboard under Product → "Arcads API - {date}".
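The reuse-or-create decision for the dated folder can be sketched as below. The shape of the folder list (dicts with `name` and `id` keys) is an assumption about the response body, not something confirmed here, and both function names are hypothetical.

```python
from datetime import date


def session_folder_name(today: date) -> str:
    """Name used for the dated folder and project, e.g. 'Arcads API - 2026-04-09'."""
    return f"Arcads API - {today.isoformat()}"


def find_existing_folder(folders, name):
    """Return the first folder whose name matches exactly, or None if it must be created.

    `folders` is assumed to be a list of dicts with at least a 'name' key.
    """
    for folder in folders:
        if folder.get("name") == name:
            return folder
    return None
```

If `find_existing_folder` returns `None`, the agent would go on to create the folder and project as described above; otherwise it reuses the returned entry.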
Credit cost estimation (MANDATORY — show before generating)
Before firing any generation calls, calculate and present the total credit cost to the user as an estimate. Do not generate until the user confirms.
ALWAYS label credit totals as estimates and tell the user to confirm the exact cost in the Arcads platform before generating if precision matters. The Arcads API does not expose billing endpoints; pricing varies by duration, resolution, and reference inputs.
Cost data sources (in priority order)
- — historical record of actual values for every previous call. Read this first. Grep for entries with the same and similar config (same , , , ) and use the recorded as the estimate. This is the most accurate source.
- → Credit costs — user-provided pricing rules (e.g. "Seedance 2.0 image-to-video ≈ 0.06/sec"). Use when no matching log entry exists.
- Ask the user — if neither source has data for the config, ask the user and write the answer into .
Never invent numbers. Always cite the source of the estimate ("based on log entry from YYYY-MM-DD" or "from MASTER_CONTEXT.md rate table").
How to calculate
total_credits ≈ sum(credits_per_model × variations_requested) for each model
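As a rough sketch, the sum can be computed and formatted like this. The class and function names are hypothetical, and any rates passed in must come from the logs, the rate table, or the user; the code does not supply pricing of its own.

```python
from dataclasses import dataclass


@dataclass
class EstimateLine:
    label: str                     # e.g. "Veo 3.1"
    credits_per_generation: float  # taken from a cited source, never invented
    variations: int                # number of separate API calls requested
    source: str                    # where the rate came from

    @property
    def subtotal(self) -> float:
        return self.credits_per_generation * self.variations


def format_estimate(lines):
    """Return (text, total) for a list of EstimateLine rows."""
    rows = ["Estimated credit cost:"]
    for line in lines:
        rows.append(
            f"  {line.label} × {line.variations} = ~{line.subtotal:g} credits"
            f" (from {line.source})"
        )
    total = sum(line.subtotal for line in lines)
    rows.append(f"Estimated total: ~{total:g} credits")
    rows.append("Estimate only. Confirm exact cost in the Arcads platform before proceeding.")
    return "\n".join(rows), total
```

Keeping the source on every row makes it hard to present a number without citing where it came from.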
Example output to user
Estimated credit cost:
Seedance 2.0 (15s i2v) × 1 = ~0.9 credits (from logs/arcads-api.jsonl 2026-04-09)
Veo 3.1 × 2 = ~8 credits (from MASTER_CONTEXT.md)
─────────────────────────────
Estimated total: ~8.9 credits
⚠️ Estimate only — confirm exact cost in the Arcads platform before proceeding.
Proceed? (yes/no)
Always wait for confirmation before firing. If the user has a credit balance visible in
, warn them if the total would exceed it. If neither the logs nor
have data for the config, ask the user before the first generation and save the answer.
Exception — QA-fix retries (still images only): After the user has confirmed the initial batch,
automatic regeneration to fix visible defects (see
Generated image QA below) does
not require asking again for credit confirmation. Each retry is still billed — note the extra
when summarizing the session.
Generation count: multiple variations per prompt
Before firing any generation call, ask the user how many variations they want for this prompt. Default is 1 if they don't specify.
When the count is greater than 1, send N separate API calls with the identical payload. Do NOT batch them into a single request — the API has no batch parameter. Fire them in parallel where possible, then poll all asset IDs concurrently.
Present results as a numbered list so the user can compare and pick favorites.
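One way to fire N identical calls in parallel is a small thread pool, sketched below. The `submit` argument is a hypothetical callable that performs one POST and returns an asset ID; the real request code is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_variations(submit, payload, count=1):
    """Call submit(payload) `count` times in parallel and return results in request order."""
    if count < 1:
        raise ValueError("count must be at least 1")
    with ThreadPoolExecutor(max_workers=count) as pool:
        # Each call gets its own shallow copy so one call cannot mutate another's payload.
        futures = [pool.submit(submit, dict(payload)) for _ in range(count)]
        # result() re-raises any exception from the worker, so failures are not hidden.
        return [future.result() for future in futures]
```

The returned list can then be numbered 1..N when presenting results for comparison.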
Nano Banana image: model choice ( vs Nano Banana Pro)
For
when using a Nano Banana engine:
- Default: (Nano Banana 2).
- Optional: when the user asks for Nano Banana Pro (the API has no enum — Pro maps to ; see nano-banana.md).
Before the first Nano Banana image call in a workflow, ask:
"Use default Nano Banana 2, or Nano Banana Pro?" If they have no preference, use
. Include the chosen
in the credit estimate (separate rows in
if pricing differs).
Script and dialogue
For any video that features a person speaking, ask the user for the script (the exact words the AI person should say). This is separate from the visual prompt — it's the dialogue.
MANDATORY — dialogue confirmation gate
Before generating any video that contains spoken dialogue, the agent MUST:
- Extract the dialogue lines from the full prompt and show them to the user in a dedicated block, separate from the visual/cinematography description.
- Present them as a clean, numbered list with beat labels (hook / show / demo / verdict, or similar) and any silent beats clearly marked as (silent beat — no dialogue).
- Read the dialogue silently at a natural speaking pace, time it against the target duration, and report the total spoken word count and whether it comfortably fits.
- Explicitly ask for dialogue approval before moving on — e.g. "Approve this dialogue? (yes / edit / rewrite)". Never assume approval from earlier confirmations (tone, template, credit cost). Dialogue approval is its own gate.
- Only after the user types (or equivalent) may you proceed to the credit cost confirmation and then generation. If the user says "edit" or proposes changes, revise and re-present the numbered dialogue block until they approve.
Presentation format (use this exact structure):
📝 Dialogue script (please confirm before I generate)
1. [HOOK] "Bro. BRO. Look what just showed up."
2. [SHOW] "The PAID SOCIAL stripe? Insane. Like, who greenlit this?"
3. [DEMO] (silent beat — thumb brushing the suede, small nod)
4. [VERDICT] "I'm literally wearing these to the gym tomorrow. You guys have to see these in person."
Total spoken words: ~32 | Target duration: 15s | Fits at natural pace: ✅
Approve this dialogue? (yes / edit / rewrite)
This gate applies to Seedance 2.0, Veo 3.1, Sora 2, and Scene — any flow where the model speaks. Skip for silent flows (B-roll, pure product-hero, premium-reveal with no voiceover, Nano Banana images).
Model-specific notes
- For Seedance 2.0, Veo 3.1, and Sora 2: embed the dialogue in the field using a or pattern (these models generate speech from the text prompt).
- For Seedance 2.0 specifically: before generating, always ask the user whether to enable audio output (). Also ask whether they want to supply (e.g. background music or a specific voice clip). Upload audio files via presigned URL if provided.
- For Scene (): use the dedicated field for dialogue and for visuals.
- For B-roll: no speech — b-roll is silent/ambient by nature. If the user wants speech, redirect to Seedance 2.0, Veo 3.1, Sora 2, or Scene.
- For Nano Banana images: no speech — these are still images. Speech is handled in the subsequent video generation step.
Script length → video duration (auto-select)
Use the script's word count to automatically pick the best
value. Average speaking pace:
~2.5 words per second (~150 WPM). Round
up to the next available duration to give breathing room.
Sora 2 — duration enum: seconds
| Script length | Duration |
|---|---|
| 1–8 words | 4s |
| 9–18 words | 8s |
| 19–28 words | 12s |
| 29–38 words | 16s |
| 39–48 words | 20s |
| 49+ words | Too long — offer to split (see below) |
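The Sora 2 table above can be applied with a simple lookup, sketched here. The thresholds are copied from the table rows; the function name is hypothetical, and words are counted by whitespace splitting, which is an approximation.

```python
# (max words, duration in seconds), copied from the Sora 2 table.
SORA2_DURATION_BY_MAX_WORDS = [(8, 4), (18, 8), (28, 12), (38, 16), (48, 20)]


def sora2_duration(script: str):
    """Return the duration in seconds for a script, or None if it is too long."""
    words = len(script.split())
    if words == 0:
        raise ValueError("script is empty")
    for max_words, seconds in SORA2_DURATION_BY_MAX_WORDS:
        if words <= max_words:
            return seconds
    return None  # 49+ words: offer to split the script
```

A `None` result is the signal to show the word/duration math and offer a split.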
Veo 3.1 — no field
Veo 3.1 auto-determines video length (~8s typical). If the script exceeds ~20 words, warn the user that Veo may truncate dialogue and offer to split the script or switch to Sora 2, which has longer duration options.
Seedance 2.0 — duration: 4–15 seconds (continuous)
Seedance 2.0 supports any integer from 4 to 15. Use ~2.5 words/second, round up to the nearest second.
| Script length | Duration |
|---|---|
| 1–8 words | 4–5s |
| 9–15 words | 6–8s |
| 16–25 words | 9–12s |
| 26–35 words | 13–15s |
| 36+ words | Too long — offer to split into multiple clips |
For no-dialogue styles (product hero, premium reveal), default to 15s.
Resolution: Default to
. Only use
if the user asks for a faster/cheaper test generation.
Aspect ratio: (vertical, default for UGC/social) or
(landscape). No
support.
B-roll (Kling 3.0) — duration enum: seconds
B-roll is typically wordless. If the user insists on a timed clip with context:
| Script length | Duration |
|---|---|
| 1–12 words | 5s |
| 13–24 words | 10s |
| 25+ words | Too long — redirect to Sora 2 / Veo 3.1 for speech |
Scene — no field
Scene auto-determines length. Use the
field for dialogue.
Splitting long scripts into multiple videos
If the script exceeds the maximum duration for the chosen model:
- Tell the user the script is too long for a single video and show the word/duration math.
- Offer two options:
- Split into segments — the agent breaks the script at natural sentence boundaries into chunks that each fit within the model's max duration. Each chunk becomes a separate generation call.
- Switch models — if they're on Kling (10s max), suggest Sora 2 (up to 20s).
- If the user chooses to split, generate each segment as a separate video (respecting the generation count — if they asked for 3 variations, generate 3 of each segment).
- Offer to stitch the final segments together using :
- Download all segment videos locally.
- Concatenate using ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4 (re-encode if codecs differ).
- Present the stitched file alongside the individual segments so the user has both.
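The splitting and stitching steps above could be sketched as follows. Both helpers are hypothetical; the sentence splitter is a simple punctuation-based heuristic, and the word budget per chunk is whatever the chosen model's table allows.

```python
import re


def split_script(script: str, max_words: int):
    """Greedily pack whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept as its own chunk so the
    caller can flag it rather than cut it mid-sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def concat_list(paths):
    """Build the contents of list.txt for ffmpeg's concat demuxer.

    Assumes paths contain no single quotes; those would need escaping.
    """
    return "".join(f"file '{p}'\n" for p in paths)
```

Each chunk becomes one generation call, and the downloaded segment paths feed `concat_list` in playback order.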
Veo 3.1: vs — pick one
Veo 3.1 has two mutually exclusive image input modes. Never use both on the same call.
| Mode | Field | When to use |
|---|---|---|
| Start frame | (presigned upload ) | User provides a reference image of a person or scene they want the video to start from. The video will animate from this exact image. Use this for influencer recreation, character consistency, or any "make this image come alive" request. |
| Reference images | (array of strings) | User provides images for style, mood, or visual tone — not to appear literally in frame. The model uses them as inspiration, not as a first frame. |
Default rule: When the user provides a single reference photo of a person,
always use unless they explicitly say they want it as a style reference.
Image handling: auto-upscale small inputs
Before sending any reference image, start frame, or base64 image to the API:
- Check dimensions. If the image's longest side is below 1024 px, upscale it using Lanczos resampling so the longest side reaches 1080 px (preserve aspect ratio).
- Convert to RGB JPEG (quality 90–95) to strip alpha channels and keep payload size reasonable.
- Re-encode as base64 (for ) or upload the resized file (for via presigned URL).
Several Arcads endpoints (notably
) reject images below a minimum resolution with
422 — The provided image is too small
. Auto-upscaling prevents this silently so the user never hits the error.
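The size check from the first step can be isolated as a pure function, sketched below. The function name is hypothetical. The actual Lanczos resampling and JPEG conversion would be done with an imaging library and are not shown.

```python
def upscaled_size(width: int, height: int, min_longest: int = 1024, target_longest: int = 1080):
    """Return the (width, height) to resize to, preserving aspect ratio.

    Images whose longest side is already at least min_longest are left unchanged.
    """
    longest = max(width, height)
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    if longest >= min_longest:
        return width, height
    new_width = max(1, round(width * target_longest / longest))
    new_height = max(1, round(height * target_longest / longest))
    return new_width, new_height
```

For example, a 600×800 input maps to 810×1080, while a 1024×768 input is returned as is.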
Generated image QA (mandatory)
Applies to
still images from Arcads, especially
(Nano Banana and other image models). After each image asset reaches
,
visually inspect the output (download or open the image URL / use the agent's image-reading capability).
Look for: extra or missing hands or fingers; wrong limb count; distorted, duplicated, or merged facial features; melted or fused objects; impossible anatomy; stray limbs; obvious texture or boundary artifacts; unreadable or garbled text if text was requested.
If something looks wrong: Do not hand off the bad frame as the final deliverable without trying again. Regenerate with a revised prompt that explicitly corrects the issue (e.g. "exactly two hands, five fingers each, anatomically correct arms," "single face, no duplicate features"). Do not resend the identical payload and expect a different outcome.
Retry cap: Up to 2 regeneration attempts per originally requested image (3 attempts total including the first). If defects remain after the cap, stop auto-retries, tell the user what still looks wrong, show the best attempt or URLs for all attempts, and ask how they want to proceed.
Credits: Each attempt is a separate generation and is billed. Summarize total credits used for that image after the QA loop ends. See the QA-fix retries exception under Credit cost estimation.
Video (optional quick check): Before spending heavily on downstream video, you may spot-check scene/b-roll thumbnails or extracted frames for the same kinds of defects; scope is lighter than for hero stills.
Details and checklist items: prompting/prompt-library/nano-banana.md.
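The retry cap described in this section can be sketched as a bounded loop. The three callables (`generate`, `inspect`, `revise`) are hypothetical stand-ins for the image call, the visual inspection, and the prompt revision; none of them is an Arcads API.

```python
def qa_loop(generate, inspect, revise, prompt, max_retries=2):
    """Generate, inspect, and regenerate with a revised prompt, up to max_retries retries.

    Returns a dict with 'passed' and the full list of attempts so every billed
    generation can be summarized afterwards.
    """
    attempts = []
    for attempt in range(max_retries + 1):
        image = generate(prompt)
        defects = inspect(image)  # empty list means the image passed QA
        attempts.append({"prompt": prompt, "image": image, "defects": defects})
        if not defects:
            return {"passed": True, "attempts": attempts}
        if attempt < max_retries:
            # Never resend the identical payload: revise the prompt to target the defects.
            prompt = revise(prompt, defects)
    return {"passed": False, "attempts": attempts}
```

When `passed` is false, the attempts list holds everything needed to show the user what still looks wrong and ask how to proceed.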
Execution checklist (agent)
- Session folder: Ensure today's dated folder + project exist (see above).
- Resolve (and from session folder): or ask the user.
- Ask for script/dialogue: If the output is a video with a person speaking, ask the user for the exact words. Count words to auto-select duration (see "Script length → video duration" above). If too long, offer to split. (Skip for Nano Banana image-only requests.)
- MANDATORY dialogue confirmation gate (before credit cost / before generation): Extract the dialogue lines from the drafted prompt and present them to the user as a dedicated, numbered block separate from the visual description. Follow the format in Script and dialogue → MANDATORY dialogue confirmation gate. Wait for explicit before moving on. This gate is separate from the credit cost confirmation — both must be satisfied.
- Nano Banana image model: For , confirm Nano Banana 2 (default) vs Nano Banana Pro () per the section above. Skip if not an image call.
- Ask for generation count: Ask how many variations the user wants for this prompt. Default to 1.
- Show credit cost and get confirmation: Calculate total credits using the cost data sources above. Present the breakdown to the user as an estimate. Do NOT proceed until they confirm.
- Check folder: Before composing the prompt, check the repo-root folder for relevant images: for person recreation, for product showcase, for style/mood. If the user hasn't provided an image but a relevant one exists in , offer to use it. Auto-upscale any reference image if needed. For Veo 3.1, determine whether to use or (see section above — default to for person photos).
- Compose JSON per OpenAPI / reference.md. Primary video endpoint: with the appropriate value (see in reference.md). Include when the DTO supports it. Set based on script length for models that require it. For Nano Banana images, use with set per the Nano Banana section ( unless the user chose Pro).
- Seedance 2.0 extras: Set to (default). Set to (UGC/social) or (landscape). Include per user confirmation. If the user provided reference images, upload via presigned URL and pass strings in (max 3). Same for and if provided. Keep tokens in the prompt text alongside the array.
- ⚠️ Seedance 2.0 mutually exclusive input modes (confirmed 2026-04-09): and cannot be combined in the same request — the API returns . Pick one: image-to-video OR video-to-video. may be combined with either. See for details.
Seedance 2.0 v2v + human faces — RESOLVED 2026-04-14: v2v with people/faces in reference videos now works. Previously blocked by content checker (April 9). See .
Seedance 2.0 audio+image 500 regression — RESOLVED 2026-04-14: + now works. Previously returned HTTP 500 (April 9). Always use freshly obtained presigned URLs. See .
- the correct endpoint N times (once per requested variation) with the same payload. Fire in parallel where possible. Immediately after the POST succeeds, append a log entry to with the request config (model, duration, resolution, aspectRatio, audioEnabled, reference counts, promptWordCount, assetId). Do NOT log the full prompt text, API keys, or Authorization headers.
- Poll: for video IDs; for asset IDs (including Nano Banana images) until is or (see reference.md). Poll all asset IDs concurrently. When polling completes, update the log entry with , ,
response.generationTimeSec
, , , and (if failed). See for the schema.
- Generated image QA: For each still image produced in this turn (e.g. ), follow Generated image QA: inspect the image; if defective, regenerate with a refined prompt until pass or 2 retries are exhausted. Skip this step for video-only outputs with no still to review.
- Assign ALL assets to session project: After generation (and QA retries), check each asset's array. If it does not include the session , call
POST /v1/assets/add-to-project
. This applies to every generated asset — including failed QA attempts and intermediate assets like Nano Banana stills used as starting frames for subsequent video generations. All assets from the session must end up in the same dated project folder.
- Present results: Return watch URLs, image URLs, or download URLs for QA-passed stills (or the best attempt after max retries, with a clear note). If multiple variations, present as a numbered list for comparison. Explain with moderation/validation hints if occurred. For Nano Banana images used as starting frames, show the image and wait for user approval before proceeding to video generation.
- ALWAYS open the output folder on the user's machine after saving generated files so they can immediately review:
open "<output_directory>"
(macOS). Save videos to with a descriptive subfolder (e.g. , ).
- Stitch if split: If the script was split into segments, offer to stitch the final videos together with and provide both the stitched file and individual segments.
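The logging step in the checklist (record the request config, never the prompt text or credentials) could look roughly like this. The allow-list approach, the `timestamp` field, and both function names are assumptions for illustration; the real schema is defined elsewhere in the skill's files.

```python
import json
from datetime import datetime, timezone

# Only allow-listed config keys are copied, so prompt text and secrets cannot leak in.
LOGGED_FIELDS = ("model", "duration", "resolution", "aspectRatio", "audioEnabled")


def build_log_entry(config, prompt, asset_id, reference_counts=None, now=None):
    """Build one JSONL log entry from the request config without the prompt itself."""
    entry = {key: config[key] for key in LOGGED_FIELDS if key in config}
    entry.update(reference_counts or {})  # hypothetical count fields supplied by the caller
    entry["promptWordCount"] = len(prompt.split())
    entry["assetId"] = asset_id
    entry["timestamp"] = (now or datetime.now(timezone.utc)).isoformat()
    return entry


def append_jsonl(path, entry):
    """Append one entry as a single line of JSON."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

Using an allow-list rather than a deny-list means a new sensitive field added to the payload later is excluded by default.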
Errors (user-facing)
- 401/403: Fix API key / workspace access (setup flow above).
- 404: Wrong UUID; re-fetch lists.
- 422: Validation or moderation — tighten prompt, remove disallowed content, check required enums (aspect ratio, duration).
- 500: Retry later; if repeated, stop and report.
Supporting files
- reference.md — endpoints, auth detail, polling, model mapping notes, schema.
- prompting/guide.md — marketing brief → API.
- Seedance 2.0:
- prompting/prompt-library/seedance-2.md — main Seedance 2.0 model guide (platform rules, API parameters, style template directory).
- prompting/prompt-library/seedance-2-ugc.md — 9-layer UGC selfie-style formula for Seedance 2.0.
- prompting/prompt-library/seedance-2-premium-reveal.md — dark-void premium product reveal (no person).
- prompting/prompt-library/seedance-2-product-hero.md — elemental product hero with splash/effects (no person).
- prompting/prompt-library/seedance-2-studio-lookbook.md — studio lookbook with voiceover.
- prompting/prompt-library/seedance-2-feature-walkthrough.md — fast-paced feature walkthrough demo.
- prompting/analyze-video/SKILL.md — reverse-engineer a reference video into a reusable Seedance 2.0 prompting template.
- prompting/clone-ad/SKILL.md — clone a reference video ad for a different product (end-to-end: analyze → adapt → generate).
- Other models:
- prompting/prompt-library/influencer-recreation.md — analyze a reference photo and recreate the influencer.
- prompting/prompt-library/ugc-selfie-style.md — cross-model UGC guide (iPhone aesthetic, negative prompts, per-model formulas).
- prompting/prompt-library/product-showcase.md — product-in-hand video workflow (Nano Banana image → approve → video).
- prompting/prompt-library/nano-banana.md — Nano Banana image prompting guide.
- prompting/prompt-library/character-sheet.md — generate a 10-image character sheet for a new AI influencer from a text description.
- prompting/prompt-library/ugc-product-selfie.md — UGC selfie-style still image: character + product + style references.
- prompting/brand-voice-starter.md — template to copy into .