![AvatarFX by c.ai]()
We’re thrilled to unveil AvatarFX, an innovative breakthrough from the Character.AI Multimodal team that can transform static images into photorealistic, expressive videos, making them speak, sing, and emote with just a click.
What is AvatarFX?
AvatarFX advances the state of the art in video generation technology with features like:
- Photorealistic video generation with synchronized audio
- Strong temporal consistency across face, hand, and body movements
- Support for long-form videos with fluid, realistic motion
- Generation of top-quality videos from a single pre-existing image, offering users maximum creative control
- Multi-speaker and multi-turn video generation capabilities
You can check out some amazing demos here.
![AvatarFX by c.ai demo]()
We’re actively working to integrate the AvatarFX model into the Character.AI product over the coming months, making these cutting-edge video features accessible to all users.
CAI+ subscribers will be the first to experience these new capabilities when they launch.
How Does AvatarFX Work?
Achieving such realism and expressive nuance requires sophisticated technology:
- Our team built a parameter-efficient training pipeline leveraging flow-based diffusion models on top of the DiT architecture. This enables realistic lip-sync, head, and body movements driven by audio sequences.
- A novel inference strategy preserves visual quality, motion consistency, and expressive diversity even across arbitrarily long videos.
- Our data experts curated a rich, diverse dataset spanning a wide range of styles — from realistic humans and mythical creatures to inanimate objects with faces — filtering out low-quality data to train a powerful generative model.
- On the audio front, we use Character.AI’s proprietary Text-to-Speech (TTS) voice model to generate lifelike voices.
- To ensure speed and efficiency, we utilize distillation techniques that reduce diffusion steps, accelerating inference with minimal loss in quality.
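
To give a feel for why the number of diffusion steps matters, here is a minimal sketch of the sampling loop that flow-based generative models typically rely on: fixed-step Euler integration of a velocity field from t=0 to t=1. This is not AvatarFX's implementation. `toy_velocity`, `TARGET`, and every other name below are hypothetical stand-ins, and a one-dimensional analytic field takes the place of a learned DiT network so the example stays self-contained.

```python
import math
from typing import Callable

# A velocity field maps (state x, time t) to dx/dt.
Velocity = Callable[[float, float], float]


def euler_sample(velocity: Velocity, x0: float, num_steps: int) -> float:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    dt = 1.0 / num_steps
    x = x0
    for k in range(num_steps):
        t = k * dt
        x += dt * velocity(x, t)  # one model evaluation per step
    return x


TARGET = 1.0  # hypothetical "data" point the toy flow drifts toward


def toy_velocity(x: float, t: float) -> float:
    """Assumed stand-in for a learned network: pulls x toward TARGET."""
    return TARGET - x


def exact_endpoint(x0: float) -> float:
    """Closed-form solution of dx/dt = TARGET - x at t=1."""
    return TARGET + (x0 - TARGET) * math.exp(-1.0)


def sampling_error(num_steps: int, x0: float = 0.0) -> float:
    """Absolute gap between the Euler result and the exact endpoint."""
    return abs(euler_sample(toy_velocity, x0, num_steps) - exact_endpoint(x0))


if __name__ == "__main__":
    for steps in (4, 8, 32):
        print(steps, sampling_error(steps))
```

Each step costs one model evaluation, so fewer steps means faster generation, but in this toy the error grows as the step count drops. Distillation, as described above, aims to train a model that stays close to the many-step result while using only a few steps.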
What Sets AvatarFX Apart?
AvatarFX offers several technical breakthroughs:
- Generates high-quality videos of 2D animated characters, 3D cartoons, and non-human faces (like your favorite pet!)
- Maintains top-tier temporal consistency in facial, hand, and body movements, even in long-form video content
- Allows generation from a single existing image rather than relying solely on text-to-image, giving users greater control over their outputs
These advances will empower Character.AI to revolutionize storytelling, making it easier and more fun for creators to bring their characters and stories to life.
![Character.AI - AvatarFX]()
From Lab to Launch: Scaling AvatarFX for Everyone
Our mission is to make AvatarFX affordable, intuitive, and accessible for all Character.AI users. Our expert team of designers and engineers is hard at work optimizing every layer of the technology stack—from GPU orchestration and caching to queuing and media delivery.
When integrated into the platform, generating a high-quality video will feel as seamless as clicking a single “Generate” button.
Commitment to Safety
Even in this testing phase, we prioritize user safety and ethical use through:
- Robust content safety filters on user dialogues to block harmful or policy-violating content
- Industry-leading tools to block videos generated from photos of minors, high-profile politicians, and other notable figures
- AI-based anonymization to alter human photos so that individuals are not recognizable
- Watermarking all generated videos to clearly indicate that they are not real footage
- A strict terms of use agreement that prohibits impersonation, bullying, unauthorized use of protected intellectual property, and misuse of the technology
- A transparent one-strike policy for violations that ensures accountability