Developer Guide

Analyze Script Workflow


End-to-end pipeline that transforms a user's script into a complete storyboard with images, motion video, and music.

High-Level Overview

Timing source: Measured from local QStash logs for a 9-scene run. Phase 4 runs image generation and motion/music prompt generation in parallel, reducing wall-clock time significantly.

Triggering Flow

The pipeline starts from server handlers in src/functions/sequences.ts:

  1. createSequenceFn — Creates a new sequence record, then calls triggerWorkflow('/storyboard', input) via QStash
  2. updateSequenceFn — If script, style, aspect ratio, or analysis model changed, triggers the same workflow
  3. retryStoryboardFn — Retries a failed sequence (resets status to processing, re-triggers)

All three use triggerWorkflow() from src/lib/workflow/client.ts, which:

  • Resolves the webhook URL (rewrites localhost to host.docker.internal for local dev)
  • Calls WorkflowClient.trigger() with the URL {baseUrl}/api/workflows/storyboard
  • Returns a workflowRunId for tracking
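
The URL resolution described above can be sketched as follows. This is a minimal illustration, not the code in src/lib/workflow/client.ts: the helper name resolveWorkflowUrl and its signature are hypothetical, and only the localhost rewrite and the {baseUrl}/api/workflows/... shape come from this guide.

```typescript
// Hypothetical helper; the real logic in src/lib/workflow/client.ts may differ.
function resolveWorkflowUrl(baseUrl: string, workflowPath: string): string {
  const url = new URL(baseUrl);

  // Local dev: rewrite localhost so the callback can reach the dev server
  // from inside a container.
  if (url.hostname === "localhost") {
    url.hostname = "host.docker.internal";
  }

  // Normalize the path so both "storyboard" and "/storyboard" work.
  const path = workflowPath.startsWith("/") ? workflowPath : `/${workflowPath}`;
  return `${url.origin}/api/workflows${path}`;
}
```

With a base URL of http://localhost:3000 and the path /storyboard, this sketch yields http://host.docker.internal:3000/api/workflows/storyboard; non-local base URLs pass through unchanged.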

Input shape (StoryboardWorkflowInput):

Field | Type | Purpose
userId | string | Auth context
teamId | string | Auth context
sequenceId | string | Target sequence
options | object | framesPerScene, generateThumbnails, etc.
autoGenerateMotion | boolean | Whether to generate video for each frame
autoGenerateMusic | boolean | Whether to generate music for the sequence
musicModel | string? | Override music model
imageModels | string[]? | Multiple image models for parallel gen
suggestedTalentIds | string[]? | Pre-selected talent for casting
suggestedLocationIds | string[]? | Pre-selected locations for matching
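
As a reading aid, the input shape above could be written as a TypeScript interface along these lines. The field names and optionality follow the table; the types of the individual options entries are assumptions, and any options beyond the two named here are left open rather than guessed.

```typescript
// Sketch of the input shape; not copied from the codebase.
interface StoryboardWorkflowInput {
  userId: string;
  teamId: string;
  sequenceId: string;
  options: {
    framesPerScene?: number; // assumed to be numeric
    generateThumbnails?: boolean; // assumed to be a flag
    [key: string]: unknown; // other options are not listed in this guide
  };
  autoGenerateMotion: boolean;
  autoGenerateMusic: boolean;
  musicModel?: string;
  imageModels?: string[];
  suggestedTalentIds?: string[];
  suggestedLocationIds?: string[];
}

// Example payload with placeholder IDs.
const exampleInput: StoryboardWorkflowInput = {
  userId: "user_1",
  teamId: "team_1",
  sequenceId: "seq_1",
  options: { framesPerScene: 1, generateThumbnails: true },
  autoGenerateMotion: true,
  autoGenerateMusic: false,
};
```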

Storyboard Workflow

File: src/lib/workflows/storyboard-workflow.ts

The storyboard workflow validates data, generates a poster image, then delegates to the analyze-script workflow.

Step: verify-clear-and-start-processing

  1. Validates auth via validateSequenceAuth()
  2. Loads sequence with getSequenceForUser() — checks script and style exist
  3. Loads and parses the style config
  4. Deletes all existing frames for the sequence
  5. Sets sequence status to processing
  6. Returns resolved models: analysisModelId, imageModel, videoModel

Step: generate-poster

  • Generates a poster image from the script+title+style for the video player empty state
  • Non-critical — failures are logged and swallowed
  • Emits generation.poster:ready with the URL

The workflow then invokes analyzeScriptWorkflow with retries (3 attempts, exponential backoff).

After the analyze-script workflow completes, it marks the sequence status as completed and emits generation.complete.

Analyze Script Workflow — Phase-by-Phase

File: src/lib/workflows/analyze-script-workflow.ts

This is the core orchestration workflow. It uses context.invoke() for sub-workflows and Promise.all() for parallelism.

Phase 1: Scene Splitting (Streaming LLM)

Sub-workflow: sceneSplitWorkflow (src/lib/workflows/scene-split-workflow.ts)

Uses streaming LLM output to create frames progressively as scenes arrive, and triggers preview image generation for each scene.

Steps:

  1. prepare-scene-splitting — Fetches the prompt template from Langfuse
  2. scene-splitting-stream — Streams the LLM response through createStreamingSceneParser():
    • Parses incremental JSON chunks via partial-json
    • On each complete scene: calls upsertFrame() to create/update the frame in DB, emits generation.scene:new and generation.frame:created
    • On title detection: updates the sequence title, emits generation.updated
    • Preview images: After each scene completes, triggers an image workflow (fire-and-forget via triggerWorkflow) using PREVIEW_IMAGE_MODEL for instant visual feedback
    • On scene:updated events: upserts frame with partial metadata as scenes stream in
  3. reconcile-frames — Bulk upserts all frames via bulkInsertFrames() to handle QStash replay safety (idempotent on sequenceId + orderIndex conflict). Also emits frame:created for any frames missed during streaming.
  4. deduct-llm-credits-scene-splitting — Credit deduction
  • Prompt: phase/scene-splitting-chat
  • Variables: { aspectRatio, script } (script is sanitized)
  • Response schema: sceneSplittingResultSchema
  • Output: { scenes[], title, frameMapping[] }. frameMapping is an array of { sceneId, frameId } pairs used throughout the remaining phases
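
The replay safety of reconcile-frames rests on upserts keyed by sequenceId + orderIndex. The sketch below models that idea with an in-memory map instead of a database; the FrameStore class and the Frame fields other than sequenceId and orderIndex are hypothetical, and the real helpers in src/lib/db/helpers/frames.ts will look different.

```typescript
// In-memory stand-in for the frames table (illustration only).
interface Frame {
  sequenceId: string;
  orderIndex: number;
  sceneId: string;
  title: string;
}

class FrameStore {
  private rows = new Map<string, Frame>();

  // Insert, or update on (sequenceId, orderIndex) conflict.
  upsertFrame(frame: Frame): Frame {
    const key = `${frame.sequenceId}:${frame.orderIndex}`;
    const merged: Frame = { ...this.rows.get(key), ...frame };
    this.rows.set(key, merged);
    return merged;
  }

  // Bulk variant: safe to call again if a step is replayed.
  bulkInsertFrames(frames: Frame[]): Frame[] {
    return frames.map((frame) => this.upsertFrame(frame));
  }

  count(): number {
    return this.rows.size;
  }
}
```

Because the key is derived from the conflict columns, running the same bulk insert twice leaves the row count unchanged, which is the property a replayed step needs.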

Phase 2: Casting Characters & Locations (Parallel Sub-Workflows)

After scene splitting, two sub-workflows run in parallel via Promise.all([context.invoke(...)]):

Talent Matching Workflow (src/lib/workflows/talent-matching-workflow.ts):

  1. Character extraction: durableLLMCall('character-extraction') with prompt phase/character-extraction-chat
    • Input: { scenes } (JSON-serialized)
    • Output: { characterBible } — array of characters with physical descriptions, clothing, consistency tags
  2. Talent matching (skipped if no suggestedTalentIds):
    • Loads talent records from DB by IDs
    • LLM matches characters to talent
    • Deduplicates matches (each talent/character used once), emits generation.talent:matched
  3. Returns: { characterBible, matches: talentCharacterMatches }

Location Matching Workflow (src/lib/workflows/location-matching-workflow.ts):

  1. Location extraction: durableLLMCall('location-extraction') with prompt phase/location-extraction-chat
    • Input: { scenes } (JSON-serialized)
    • Output: { locationBible } — array of locations with descriptions, architecture, color palettes
  2. Location matching (skipped if no suggestedLocationIds):
    • Loads library locations from DB by IDs
    • LLM matches locations to library entries (requires confidence >= 0.5)
    • Deduplicates matches, emits generation.location:matched
  3. Returns: { locationBible, matches: libraryLocationMatches }
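
The filtering and deduplication in the matching step might look roughly like the sketch below. The 0.5 threshold comes from this guide; the one-to-one rule for locations and the highest-confidence-first tie-breaking are assumptions, and the LocationMatch shape and the function name are hypothetical.

```typescript
interface LocationMatch {
  locationId: string;
  libraryLocationId: string;
  confidence: number;
}

// Keep matches at or above the threshold, using each location and each
// library entry at most once (assumed one-to-one rule).
function dedupeLocationMatches(
  matches: LocationMatch[],
  minConfidence = 0.5,
): LocationMatch[] {
  const usedLocations = new Set<string>();
  const usedLibraryEntries = new Set<string>();
  const kept: LocationMatch[] = [];

  // Assumption: the most confident candidate wins a contested entry.
  const byConfidence = [...matches].sort((a, b) => b.confidence - a.confidence);

  for (const match of byConfidence) {
    if (match.confidence < minConfidence) continue;
    if (usedLocations.has(match.locationId)) continue;
    if (usedLibraryEntries.has(match.libraryLocationId)) continue;
    usedLocations.add(match.locationId);
    usedLibraryEntries.add(match.libraryLocationId);
    kept.push(match);
  }
  return kept;
}
```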

Phase 3: References & Prompts (Parallel Sub-Workflows)

Three sub-workflows invoked in parallel via Promise.all([context.invoke(...)]):

Character Bible Workflow (src/lib/workflows/character-bible-workflow.ts):

  • Generates a reference sheet image for each character (parallel per character)
  • Uses talent match images as reference when available
  • Uploads sheets to R2 storage
  • Creates sequence_characters DB records

Location Bible Workflow (src/lib/workflows/location-bible-workflow.ts):

  • Inserts location records into DB from location bible
  • Generates establishing-shot reference images for each location (parallel)
  • Uses library location reference images when matched
  • Uploads to R2 storage, updates DB

Visual Prompt Workflow (src/lib/workflows/visual-prompt-workflow.ts):

  • Delegates to visualPromptSceneWorkflow per scene (parallel via context.invoke)
  • Each scene gets an LLM call that generates a fullPrompt, negativePrompt, and continuity data (character tags, environment tag, color palette, lighting, style tag)
  • Merges results back into scene objects

Phase 4: Images + Motion/Music Prompts (Parallel)

This is the key parallelization — image generation and motion/music prompt generation run simultaneously since they have no dependency on each other.
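
The parallel structure of this phase can be sketched with plain promises. In the real workflow the two branches are sub-workflows started with context.invoke(); the runPhase4 helper and its callback parameters below are hypothetical stand-ins used only to show the Promise.all pattern.

```typescript
// Start both branches before awaiting either, then wait for both.
async function runPhase4<I, P>(
  generateImages: () => Promise<I>,
  generatePrompts: () => Promise<P>,
): Promise<{ images: I; prompts: P }> {
  const [images, prompts] = await Promise.all([
    generateImages(),
    generatePrompts(),
  ]);
  return { images, prompts };
}
```

Since neither branch reads the other's output, wall-clock time for the phase is bounded by the slower branch rather than the sum of the two.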

Frame Images Workflow (src/lib/workflows/frame-images-workflow.ts):

  1. Builds per-scene character and location reference maps
  2. For each scene, generates images with each selected model in parallel:
    • Invokes generateImageWorkflow per scene per model (retries: 3, exponential backoff)
    • After each image completes, invokes generateVariantWorkflow for shot grid variants
  3. Returns { imageUrls } — primary model's URL per scene

Motion + Music Prompts Workflow (src/lib/workflows/motion-music-prompts-workflow.ts):

  1. Snap durations — Snaps scene durations to video model capabilities upfront so both motion prompts and music design see identical values
  2. Parallel generation — Motion prompts and music design run simultaneously:
    • motionPromptWorkflow — Per-scene LLM calls for camera movement, motion style, timing
    • generateMusicPromptWorkflow — Single LLM call classifying per-scene music requirements + generating unified prompt with tags
  3. Merge — Combines motion prompts and music design into completeScenes[]
  4. Returns: { completeScenes, musicPrompt, musicTags }
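
Duration snapping could be as simple as choosing the nearest duration the video model supports. The guide does not state the actual rule, so the nearest-value strategy, the function names, and the example durations below are all assumptions.

```typescript
// Pick the supported duration closest to the requested one.
// On a tie, the earlier entry in `supported` wins.
function snapDuration(seconds: number, supported: number[]): number {
  if (supported.length === 0) {
    throw new Error("video model must support at least one duration");
  }
  return supported.reduce((best, candidate) =>
    Math.abs(candidate - seconds) < Math.abs(best - seconds) ? candidate : best,
  );
}

// Snap every scene once, up front, so later steps see the same values.
function snapSceneDurations(
  scenes: { sceneId: string; durationSeconds: number }[],
  supported: number[],
): { sceneId: string; durationSeconds: number }[] {
  return scenes.map((scene) => ({
    ...scene,
    durationSeconds: snapDuration(scene.durationSeconds, supported),
  }));
}
```

Snapping before the two branches fork is what keeps motion prompts and music design consistent: both receive the already-snapped scenes instead of each applying its own rounding.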

Phase 5: Motion + Music Generation (Conditional)

Sub-workflow: motionBatchWorkflow (src/lib/workflows/motion-batch-workflow.ts)

Only runs if autoGenerateMotion is enabled, a video model is set, and images were generated. A single orchestrator handles:

  1. Parallel generation — All frame motion workflows + optional music workflow invoked simultaneously
  2. Collect video URLs — Reads from DB (authoritative ordering by orderIndex)
  3. Merge video — Concatenates all frame videos into one sequence video
  4. Merge audio+video — If music was generated, muxes audio onto the merged video
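
The gating condition and the ordered URL collection can be illustrated as below. Both helpers and their input shapes are hypothetical; only the three conditions and the ordering by orderIndex are taken from this guide.

```typescript
interface MotionGateInput {
  autoGenerateMotion: boolean;
  videoModel?: string;
  imageUrls: string[];
}

// Phase 5 runs only when all three conditions hold.
function shouldRunMotionBatch(input: MotionGateInput): boolean {
  return (
    input.autoGenerateMotion &&
    Boolean(input.videoModel) &&
    input.imageUrls.length > 0
  );
}

interface FrameVideoRow {
  orderIndex: number;
  videoUrl: string | null;
}

// Order by orderIndex and skip frames without a video (skipping is an assumption).
function collectVideoUrls(rows: FrameVideoRow[]): string[] {
  return [...rows]
    .sort((a, b) => a.orderIndex - b.orderIndex)
    .map((row) => row.videoUrl)
    .filter((url): url is string => url !== null);
}
```

Sorting on the stored orderIndex, rather than on the order in which parallel jobs finish, keeps the merged video in script order.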

Final: Record Trace + Return

Step: record-workflow-trace

  • Records a trace to Langfuse for observability (input script, style config, aspect ratio, complete scenes, timing)

Returns the completeScenes array.

Data Flow: Scene Object Accumulation

Each phase enriches the Scene object. The frame's metadata column is updated after visual prompts to persist intermediate results. Phase 1 creates frames progressively during streaming and triggers preview images for instant feedback.

Scene type fields (from src/lib/ai/scene-analysis.schema.ts):

Field | Added By | Notes
sceneId | Phase 1 | Required, unique
sceneNumber | Phase 1 | Required, 1-indexed
originalScript | Phase 1 | { extract, dialogue }
metadata | Phase 1 | { title, durationSeconds, location, timeOfDay, storyBeat }
prompts.visual | Phase 3 | { fullPrompt, negativePrompt, components }
continuity | Phase 3 | { characterTags, environmentTag, colorPalette, lightingSetup, styleTag }
prompts.motion | Phase 4 | { fullPrompt, components, parameters }
musicDesign | Phase 4 | { presence, style, mood, atmosphere }
sourceImageUrl | Optional | URL of generated or uploaded source image
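
One way to picture the accumulation is a type whose later-phase fields are optional until the corresponding phase has run. This is a sketch, not the schema in src/lib/ai/scene-analysis.schema.ts: the field names follow the table, while every concrete field type and the phasesApplied helper are assumptions.

```typescript
interface Scene {
  // Phase 1
  sceneId: string;
  sceneNumber: number; // 1-indexed
  originalScript: { extract: string; dialogue: unknown };
  metadata: {
    title: string;
    durationSeconds: number;
    location: string;
    timeOfDay: string;
    storyBeat: string;
  };
  // Phase 3 (visual) and Phase 4 (motion)
  prompts?: {
    visual?: { fullPrompt: string; negativePrompt: string; components: unknown };
    motion?: { fullPrompt: string; components: unknown; parameters: unknown };
  };
  // Phase 3
  continuity?: {
    characterTags: string[];
    environmentTag: string;
    colorPalette: unknown;
    lightingSetup: string;
    styleTag: string;
  };
  // Phase 4
  musicDesign?: { presence: unknown; style: string; mood: string; atmosphere: string };
  sourceImageUrl?: string;
}

// Hypothetical helper: infer which phases have enriched a scene so far.
function phasesApplied(scene: Scene): number[] {
  const phases = [1];
  if (scene.prompts?.visual) phases.push(3);
  if (scene.prompts?.motion) phases.push(4);
  return phases;
}
```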

Real-Time Events

Events emitted via Upstash Realtime on a per-sequence channel (getGenerationChannel(sequenceId)).

Event | When Emitted | Payload
generation.phase:start | Before each LLM call or generation phase | { phase, phaseName }
generation.phase:complete | After each phase completes | { phase }
generation.poster:ready | Storyboard workflow: after poster generated | { posterUrl }
generation.scene:new | Phase 1: progressively as scenes stream in | { sceneId, sceneNumber, title, scriptExtract, durationSeconds }
generation.scene:updated | Phase 1: as scene metadata updates during stream | { sceneId, sceneNumber, title, scriptExtract, durationSeconds }
generation.updated | Phase 1: after title detected in stream | { title }
generation.frame:created | Phase 1: progressively as frames are upserted | { frameId, sceneId, orderIndex }
generation.frame:updated | Phase 4: after prompts written to DB | { frameId, updateType, metadata }
generation.talent:matched | Phase 2: when talent matched to characters | { matches: [{ characterId, characterName, talentId, talentName }] }
generation.talent:unmatched | Phase 2: unused talent after matching | { unusedTalentIds, unusedTalentNames }
generation.location:matched | Phase 2: when locations matched to library | { matches: [{ locationId, locationName, libraryLocationId, ... }] }
generation.image:progress | Image workflow: generating/completed/failed | { frameId, status, thumbnailUrl? }
generation.variant-image:progress | Variant workflow: generating/completed/failed | { frameId, status, variantImageUrl? }
generation.video:progress | Motion workflow: generating/completed/failed | { frameId, status, videoUrl? }
generation.audio:progress | Music workflow: generating/completed/failed | { status, audioUrl? }
generation.character-sheet:progress | Character bible: per character | { characterId, status, sheetImageUrl? }
generation.location-sheet:progress | Location bible: per location | { locationId, status, referenceImageUrl? }
generation.recast:start | Recast character: before regenerating frames | { characterId, frameCount }
generation.recast:complete | Recast character: all frames regenerated | { characterId, successCount, failedCount }
generation.recast:failed | Recast character: on failure | { characterId, error }
generation.recast-location:start | Recast location: before regenerating frames | { locationId, frameCount }
generation.recast-location:complete | Recast location: all frames regenerated | { locationId, successCount, failedCount }
generation.recast-location:failed | Recast location: on failure | { locationId, error }
generation.error | On non-fatal workflow error | { message, phase? }
generation.failed | On workflow failure | { message }
generation.complete | Storyboard workflow: after everything finishes | { sequenceId }
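
On the consuming side, a discriminated union makes it straightforward to handle these events exhaustively. The sketch below covers only a handful of them; the { type, data } envelope, the field types, and the describeEvent helper are assumptions for illustration, and the actual schema lives in src/lib/realtime/index.ts.

```typescript
type ProgressStatus = "generating" | "completed" | "failed";

// Subset of events, modeled as a discriminated union (envelope is hypothetical).
type GenerationEvent =
  | { type: "generation.scene:new"; data: { sceneId: string; sceneNumber: number; title: string; scriptExtract: string; durationSeconds: number } }
  | { type: "generation.frame:created"; data: { frameId: string; sceneId: string; orderIndex: number } }
  | { type: "generation.image:progress"; data: { frameId: string; status: ProgressStatus; thumbnailUrl?: string } }
  | { type: "generation.failed"; data: { message: string } }
  | { type: "generation.complete"; data: { sequenceId: string } };

// Turn an event into a short status line for a UI or a log.
function describeEvent(event: GenerationEvent): string {
  switch (event.type) {
    case "generation.scene:new":
      return `Scene ${event.data.sceneNumber}: ${event.data.title}`;
    case "generation.frame:created":
      return `Frame ${event.data.orderIndex} created`;
    case "generation.image:progress":
      return `Image for ${event.data.frameId}: ${event.data.status}`;
    case "generation.failed":
      return `Failed: ${event.data.message}`;
    case "generation.complete":
      return `Sequence ${event.data.sequenceId} complete`;
  }
}
```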

Error Handling

Failure Function

The analyze-script workflow registers a failureFunction that:

  1. Sanitizes the error via sanitizeFailResponse() — extracts inner errors from QStash wrapper patterns, maps known Cloudflare error codes (e.g., 1102 → "Worker exceeded memory limit"), and truncates messages over 500 characters
  2. Updates sequence status to 'failed' with the error message
  3. Emits generation.failed with the sanitized error
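
Two of the sanitization rules, the Cloudflare code mapping and the 500-character cap, can be sketched as below. This is not the implementation in src/lib/workflow/sanitize-fail-response.ts: the "error code: NNNN" text pattern is an assumption about how the code appears in the message, hard truncation without an ellipsis is also assumed, and the QStash wrapper extraction is omitted because its patterns are not described here.

```typescript
// Only the 1102 mapping is given in this guide; other codes are not listed.
const KNOWN_CLOUDFLARE_ERRORS: Record<string, string> = {
  "1102": "Worker exceeded memory limit",
};

const MAX_MESSAGE_LENGTH = 500;

// Hypothetical helper covering code mapping and truncation only.
function sanitizeMessage(raw: string): string {
  // Assumed format: the message contains "error code: 1102" or similar.
  const match = /error code:\s*(\d+)/i.exec(raw);
  const code = match ? match[1] : undefined;
  const known = code !== undefined ? KNOWN_CLOUDFLARE_ERRORS[code] : undefined;
  if (known) return known;

  // Cap long messages so they stay safe to store and display.
  return raw.length > MAX_MESSAGE_LENGTH ? raw.slice(0, MAX_MESSAGE_LENGTH) : raw;
}
```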

Sub-workflows (image, motion, music, character bible, location bible, talent matching, location matching, frame-images, motion-batch) each have their own failure functions that update the relevant record's status to 'failed'.

Retry Strategy

Level | Retries | Backoff
Storyboard invoking analyze-script | 3 | Exponential (2^retried * 1000ms)
Image generation per scene | 3 | Exponential
Variant generation per scene | 3 | Exponential
Motion generation per frame | 3 | Exponential
Music generation | 3 | Exponential
Individual context.run() steps | Managed by QStash (automatic)
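
The backoff formula for the storyboard-level retries works out as follows. The function name is hypothetical, and starting the retried counter at 0 is an assumption.

```typescript
// Delay before the next attempt: 2^retried * 1000 ms.
function backoffDelayMs(retried: number): number {
  return Math.pow(2, retried) * 1000;
}

// Delays for retried = 0, 1, 2: 1000 ms, 2000 ms, 4000 ms.
const retryDelays = [0, 1, 2].map(backoffDelayMs);
```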

QStash Durability

  • Each context.run() step is checkpointed — if the server restarts mid-workflow, execution resumes from the last completed step
  • context.invoke() creates a child workflow that runs independently with its own retries
  • No application-level concurrency gating — fal queues submissions server-side (IN_QUEUE doesn't count toward the cap, jobs are never rejected), and OpenRouter handles its own rate limits. Past attempts at gating via QStash flowControl produced ghost slot leaks on cancel and PR-preview cross-contamination; see #725.
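
Step checkpointing can be pictured with a small model: each named step stores its result, and a re-run of the workflow function returns stored results instead of executing the step again. The class below is a simplified, synchronous illustration of that idea, not the QStash mechanism itself; the real context.run() is asynchronous and persists state outside the process.

```typescript
// Simplified model of step checkpointing (illustration only).
class CheckpointedRun {
  constructor(private completed: Map<string, unknown> = new Map()) {}

  // Run a step once; on replay, return the stored result.
  run<T>(stepName: string, fn: () => T): T {
    if (this.completed.has(stepName)) {
      return this.completed.get(stepName) as T;
    }
    const result = fn();
    this.completed.set(stepName, result);
    return result;
  }
}
```

In this model, if a run fails after its first step, a second run that shares the same checkpoint map skips the first step and resumes at the one that failed. This is also why step bodies should be idempotent: a step that fails partway through is executed again from its start.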

Key Files Reference

File | Purpose
src/functions/sequences.ts | Server functions that trigger the pipeline
src/lib/workflow/client.ts | triggerWorkflow(): QStash integration
src/routes/api/workflows/$.ts | Workflow route registration (serveMany)
src/lib/workflows/storyboard-workflow.ts | Wrapper: verify, clear, poster, invoke analyze-script
src/lib/workflows/analyze-script-workflow.ts | Core orchestration (phases 1-5)
src/lib/workflows/scene-split-workflow.ts | Phase 1: streaming scene split + preview images
src/lib/ai/streaming-scene-parser.ts | Incremental JSON parser for streaming scene creation
src/lib/workflow/sanitize-fail-response.ts | Error message extraction from QStash failures
src/lib/db/helpers/frames.ts | upsertFrame() / bulkInsertFrames() idempotent helpers

Extraction + Matching
src/lib/workflows/talent-matching-workflow.ts | Character extraction + talent matching sub-workflow
src/lib/workflows/location-matching-workflow.ts | Location extraction + location matching sub-workflow

Reference Generation
src/lib/workflows/character-bible-workflow.ts | Character sheet generation (parallel per character)
src/lib/workflows/character-sheet-workflow.ts | Single character sheet image generation
src/lib/workflows/location-bible-workflow.ts | Location sheet generation (parallel per location)
src/lib/workflows/location-sheet-workflow.ts | Single location reference image generation

Prompt Generation
src/lib/workflows/visual-prompt-workflow.ts | Visual prompt sub-workflow (parallel per scene)
src/lib/workflows/visual-prompt-scene-workflow.ts | Per-scene visual prompt LLM call
src/lib/workflows/motion-prompt-workflow.ts | Motion prompt sub-workflow (parallel per scene)
src/lib/workflows/motion-prompt-scene-workflow.ts | Per-scene motion prompt LLM call
src/lib/workflows/motion-music-prompts-workflow.ts | Orchestrates motion + music prompts in parallel
src/lib/workflows/music-prompt-workflow.ts | Music design LLM call

Image Generation
src/lib/workflows/frame-images-workflow.ts | Orchestrates image + variant gen for all scenes
src/lib/workflows/image-workflow.ts | Single image generation (Fal.ai)
src/lib/workflows/variant-workflow.ts | Shot grid variant generation

Motion + Music Generation
src/lib/workflows/motion-batch-workflow.ts | Orchestrates motion + music + merge
src/lib/workflows/motion-workflow.ts | Single motion/video generation (Fal.ai)
src/lib/workflows/music-workflow.ts | Music generation (Fal.ai)
src/lib/workflows/merge-video-workflow.ts | Merge frame videos into sequence video
src/lib/workflows/merge-audio-video-workflow.ts | Merge music audio with video

Recasting + Regeneration
src/lib/workflows/recast-character-workflow.ts | Recast a character and regenerate affected frames
src/lib/workflows/recast-location-workflow.ts | Recast a location and regenerate affected frames
src/lib/workflows/regenerate-frames-workflow.ts | Regenerate specific frames with new prompts

Schemas + Events
src/lib/realtime/index.ts | Real-time event schema and channel helpers
src/lib/ai/scene-analysis.schema.ts | Scene type definition
src/lib/ai/response-schemas.ts | musicDesignResultSchema and other LLM response schemas