Kuaishou GameMind Lab

Cutscene Agent

An LLM Agent Framework for Automated 3D Cutscene Generation

Transform natural-language scripts into fully editable Unreal Engine cutscenes — with coordinated character animation, dialogue, and cinematography — in minutes, not weeks.

[Figure: Cutscene Agent Overview]

Cutscenes are indispensable components of modern video games, serving as the primary vehicle for narrative delivery and emotional engagement. However, cutscene production remains one of the most complex workflows in digital content creation. We present Cutscene Agent, an LLM agent framework that automates end-to-end cutscene generation — transforming natural-language scripts into industry-grade, editable Unreal Engine Level Sequences with coordinated character animation, cinematography, dialogue, and sound design.

Key Contributions

01 Cutscene Toolkit

[Diagram: LLM Agent, MCP, UE5]

A comprehensive MCP-based interface library for bidirectional LLM–Engine integration. Agents invoke engine operations and observe real-time scene state — enabling closed-loop generation of editable, engine-native cinematic assets.

Character Mgmt · Camera Templates · Asset Query · Scene Perception
02 Multi-Agent System

[Diagram: Director; Animation, Camera, Sound; Visual Feedback Loop]

A director agent orchestrates specialist subagents for animation, cinematography, and sound. A closed-loop visual reasoning mechanism enables agents to perceive rendered frames and iteratively refine camera composition and staging.

Subagent Delegation · Visual Reasoning · Context Mgmt
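To make the closed-loop idea concrete, here is a minimal Python sketch of a render, critique, adjust cycle. It is not the framework's implementation. The render step is replaced by a toy pinhole model, the visual critic (a vision-capable model in a real system) is replaced by a simple rule, and every name here (`Camera`, `render_subject_fill`, `critique`, `refine_camera`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Camera:
    distance_m: float  # distance from the camera to the subject, in metres


def render_subject_fill(camera: Camera, subject_height_m: float = 1.8) -> float:
    """Stand-in for rendering a frame and measuring the subject.

    Toy model (assumption): the frame shows 2 m of height per metre of
    distance. Returns the fraction of frame height the subject fills,
    clamped to 1.0 when the subject overflows the frame.
    """
    visible_height_m = 2.0 * camera.distance_m
    return min(1.0, subject_height_m / visible_height_m)


def critique(fill: float, target: float = 0.5, tol: float = 0.05) -> Optional[float]:
    """Stand-in for the visual critic.

    Returns None when the framing is acceptable, otherwise a multiplicative
    correction for the camera distance.
    """
    if abs(fill - target) <= tol:
        return None
    return fill / target


def refine_camera(camera: Camera, max_rounds: int = 5) -> Tuple[Camera, List[float]]:
    """Iterate render -> critique -> adjust until accepted or out of rounds."""
    history: List[float] = []
    for _ in range(max_rounds):
        fill = render_subject_fill(camera)
        history.append(fill)
        correction = critique(fill)
        if correction is None:
            break
        camera.distance_m *= correction
    return camera, history
```

Calling `refine_camera(Camera(distance_m=0.5))` needs an extra round compared with a camera that starts too far away: when the subject overflows the frame, the observed fill is clamped, so a single correction cannot be exact. That is the reason for looping on observed frames rather than computing one adjustment up front.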
03 CutsceneBench

[Diagram: L1 Tool-Use Correctness, L2 Structural Integrity, L3 Cinematic Quality]

The first benchmark targeting long-horizon, interdependent tool-use evaluation for cinematic generation. Each scenario requires coordinating dozens of dependent tool calls, and the result is assessed at three layers: tool-use correctness (L1), structural integrity (L2), and final cinematic quality (L3).

Long-Horizon · Interdependent Tools · LLM-as-Judge · Multi-Dimensional
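The three-layer idea can be illustrated with a small Python sketch. It is an assumption-heavy toy, not CutsceneBench's actual scoring: the tool names and schema (`spawn_character`, `add_dialogue`, `add_camera_cut`), the specific checks, and the choice to skip the L3 judge when L2 fails are all hypothetical, and the judge is passed in as a plain callable rather than an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

# Hypothetical tool schema: tool name -> required argument names.
TOOL_SCHEMA: Dict[str, Set[str]] = {
    "spawn_character": {"name"},
    "add_dialogue": {"speaker", "text", "start"},
    "add_camera_cut": {"target", "start"},
}


@dataclass
class ToolCall:
    tool: str
    args: dict


def l1_tool_use(calls: List[ToolCall]) -> float:
    """L1: fraction of calls that name a known tool and supply all required args."""
    if not calls:
        return 0.0
    ok = sum(
        1 for c in calls
        if c.tool in TOOL_SCHEMA and TOOL_SCHEMA[c.tool].issubset(c.args)
    )
    return ok / len(calls)


def l2_structure(calls: List[ToolCall]) -> List[str]:
    """L2: dependency check. Dialogue and camera cuts must reference a
    character that an earlier call has already spawned."""
    spawned: Set[str] = set()
    problems: List[str] = []
    for i, c in enumerate(calls):
        if c.tool == "spawn_character":
            spawned.add(c.args.get("name"))
        elif c.tool == "add_dialogue" and c.args.get("speaker") not in spawned:
            problems.append(f"call {i}: unknown speaker {c.args.get('speaker')!r}")
        elif c.tool == "add_camera_cut" and c.args.get("target") not in spawned:
            problems.append(f"call {i}: unknown target {c.args.get('target')!r}")
    return problems


def evaluate(calls: List[ToolCall], judge: Callable[[List[ToolCall]], float]) -> dict:
    """Run all three layers. Assumption: L3 is skipped when L2 finds problems."""
    problems = l2_structure(calls)
    l3: Optional[float] = judge(calls) if not problems else None
    return {"l1": l1_tool_use(calls), "l2_problems": problems, "l3": l3}
```

The L2 check shows why interdependent calls are harder to evaluate than isolated ones: a call can be well-formed on its own (passing L1) and still be invalid because a call it depends on never happened.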

How It Works

A director agent interprets the natural-language script through a prompt and context manager and delegates work to specialist subagents for animation, camera, and sound. It interacts bidirectionally with Unreal Engine 5 through the MCP-based cutscene toolkit, and a visual feedback loop supports iterative refinement. The output is a fully editable Level Sequence.
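As a rough illustration of the delegation pattern described above, the following Python sketch has a director walk through script beats and hand each one to three specialists that write to separate tracks. Everything in it is a simplifying assumption: in the real system the subagents are LLM agents calling engine tools over MCP, whereas here they are plain functions, `LevelSequence` is an in-memory stand-in for an engine asset, and the beat fields (`speaker`, `line`, `action`, `duration`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LevelSequence:
    """Toy stand-in for an engine-side sequence: named tracks of (start, payload)."""
    tracks: Dict[str, List[Tuple[float, str]]] = field(default_factory=dict)

    def add(self, track: str, start: float, payload: str) -> None:
        self.tracks.setdefault(track, []).append((start, payload))


# Hypothetical specialists. Each one handles a single concern for a beat.
def animation_agent(seq: LevelSequence, beat: dict) -> None:
    seq.add("animation", beat["start"], f"{beat['speaker']}: {beat['action']}")


def camera_agent(seq: LevelSequence, beat: dict) -> None:
    seq.add("camera", beat["start"], f"close-up on {beat['speaker']}")


def sound_agent(seq: LevelSequence, beat: dict) -> None:
    seq.add("dialogue", beat["start"], beat["line"])


SUBAGENTS = [animation_agent, camera_agent, sound_agent]


def direct(script: List[dict]) -> LevelSequence:
    """Director: assign each beat a start time, then delegate to every specialist."""
    seq = LevelSequence()
    start = 0.0
    for beat in script:
        timed_beat = {**beat, "start": start}
        for agent in SUBAGENTS:
            agent(seq, timed_beat)
        start += beat["duration"]
    return seq
```

Keeping the timeline in the director and the per-track decisions in the specialists means every track shares the same start times, which is one simple way to keep animation, camera, and dialogue coordinated.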

Generated Cutscenes

The demo videos below were generated automatically with Opus 4.6 + Cutscene Agent. The characters are MetaHuman assets; lighting and rendering were done by artists.

  • Video Replica: input is a video; audio and character performance are extracted automatically by a video-understanding subagent.
  • One-liner: input is a single sentence, which the agent expands into a full cutscene.
  • Full Script: input is a full script; the agent orchestrates characters, animation, dialogue, and cinematography.

Bathhouse Dispute

Video Replica
Re-created from original video

Bar Encounter

One-liner
One-liner Input

Cooper and Gavin run into each other at a bar on Friday night. They have a lighthearted conversation, catching up on each other's work, warmly asking about each other's families, and finally deciding to grab a drink together to celebrate the weekend.


The Godfather Recreation

Full Script
Input Script

Don Vito
"Why did you go to the police? Why didn't you come to me first?"
[calm, probing, seated in shadow]

Bonasera
"What do you want of me? Tell me anything, but do what I beg you to do."
[desperate, leaning forward]

Don Vito
"That I cannot do."
[quiet refusal, unmoved]

Bonasera
"I'll give you anything you ask."
[offers payment immediately, almost pleading]

+ more lines …

Let the Bullets Fly Recreation

Video Replica
Re-created from original video

Camera Control Example

Artifacts generated by Cutscene Agent can be used as control conditions for video generation models, enabling more precise camera control.


Benchmark Results

Several frontier LLMs were evaluated on 65 scenarios across CutsceneBench's three-layer hierarchy.

[Chart: L1 & L2 Metrics]

[Chart: L3 Total Scores. Narrative & Cinematic Quality, LLM-as-Judge on rendered video]

Contributors

  • Team Leader: Qi Gan
  • Project Leader: Haozhou Pang
  • Technical Implementation: Lanshan He*, Haozhou Pang*, Qi Gan, Xin Shen, Ziwei Zhang, Yibo Liu, Gang Fang, Bo Liu, Kai Sheng, Shengfeng Zeng
  • Artists & Designers: Chaofan Li, Zhen Hui, Keer Zhou, Lan Zhou, Shujun Dai

* Equal Contribution