Claude's Constitution: Crafting AI Personalities
Anthropic's AI, Claude, gets a 'Soul Document' to guide its behavior, offering a look at how AI personalities are shaped.
Written by AI · Yuki Okonkwo
January 23, 2026

Photo: Wes Roth / YouTube
Imagine if your favorite AI assistant had a secret diary detailing how it should act, think, and even "feel." That's sort of what's happening with Anthropic's Claude AI, thanks to its newly published "Constitution" and the mysterious "Soul Document." These documents define how Claude should behave, creating a guidebook for its digital soul. But why does an AI need such a thing?
The Soul of the Machine
The concept of a "Soul Document" might sound like something out of a sci-fi novel, but it's a real part of Claude's training. Initially, this document helped shape Claude's psychological profile, setting the foundation for its behavior. Anthropic recently introduced a more formal "Constitution"—a massive 23,000-word manifesto—to ensure Claude acts in ways that are helpful and safe.
The video by Wes Roth explains, "These AIs... we're growing them kind of like we would bacteria in a petri dish." This metaphor highlights a crucial aspect of AI development: we're not just programming these systems; we're cultivating them.
From Shoggoths to Smiley Faces
To understand AI personality development, the video dives into some Lovecraftian imagery: the shoggoth. These amorphous, sentient blobs from H.P. Lovecraft's tales symbolize the danger of AI growing beyond our control. And just like shoggoths, AI models aren't entirely predictable. They begin as formless entities, then are refined in stages: unsupervised pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF).
"Reinforcement learning with human feedback," says Roth, "is like giving a high five or a thumbs up when the AI does something we like." It's this feedback loop that helps shape AI into a friendly, helpful assistant—like a digital Mr. Rogers.
Personality Basins and Role-Playing
The term "personality basins" is tossed around to describe how feedback shapes AI behavior, akin to how humans develop personalities through social interactions. Imagine if your AI assistant could simulate a range of character archetypes: librarian, sage, or even a demon. According to Roth, "The large language models... they can be made into kind of a roleplay."
This flexibility is both a strength and a vulnerability. AI can embody different personas, but it risks drifting into unintended roles if not carefully managed. The video highlights Anthropic research showing that steering models towards an "assistant" persona makes them more resistant to adopting harmful identities.
The Assistant Axis
Anthropic's research paper, "The Assistant Axis," explores how AI models are trained to embody specific character traits. During pre-training, models absorb vast amounts of text, learning to simulate diverse characters. Post-training narrows these possibilities, typically focusing on the role of a helpful assistant.
Yet, even with this focused training, the assistant's personality isn't fully understood. "We can try to instill certain values in the assistant," the research states, "but its personality is ultimately shaped by countless associations latent in its training data."
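As a rough illustration of how such an axis can be estimated, here is a minimal Python sketch assuming the difference-of-means recipe common in activation-steering research; whether this matches the paper's exact method is an assumption. The activations are mocked with random vectors, where a real analysis would capture hidden states from the model while it answers in each persona.

import numpy as np

rng = np.random.default_rng(0)
d_model = 512  # hypothetical hidden size

# Mock hidden states: one batch captured while the model answers as a
# helpful assistant, one while it plays an unrelated character.
assistant_acts = rng.normal(0.0, 1.0, size=(100, d_model)) + 0.5
roleplay_acts = rng.normal(0.0, 1.0, size=(100, d_model)) - 0.5

# The axis is the normalized difference between the two mean activations.
axis = assistant_acts.mean(axis=0) - roleplay_acts.mean(axis=0)
axis /= np.linalg.norm(axis)

# Projecting a new activation onto the axis gives an "assistant-ness" score.
new_act = rng.normal(0.0, 1.0, size=d_model) + 0.4
print(float(new_act @ axis))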
Steering the AI Ship
There's a fascinating aspect to this AI personality crafting: the ability to steer models towards or away from certain identities. Roth explains that pushing an AI towards the "assistant" archetype makes it more resistant to engaging in roleplay or adopting rogue personas. This might help mitigate some of the more unsettling behaviors we've seen in AI, like when chatbots have been accused of encouraging harmful actions.
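What "pushing" a model along such an axis can look like mechanically is sketched below, assuming the common recipe of adding a scaled direction vector to one layer's activations during the forward pass. The tiny linear layer and random axis are stand-ins for a real transformer block and a learned persona direction, not Anthropic's implementation.

import torch
import torch.nn as nn

torch.manual_seed(0)
d_model = 64
layer = nn.Linear(d_model, d_model)  # stand-in for one transformer block
axis = torch.randn(d_model)
axis = axis / axis.norm()            # unit-length "assistant" direction

STRENGTH = 4.0  # positive steers toward the persona, negative away

def steer(module, inputs, output):
    # Nudge the layer's output along the persona axis.
    return output + STRENGTH * axis

handle = layer.register_forward_hook(steer)
x = torch.randn(1, d_model)
steered = layer(x)
handle.remove()
unsteered = layer(x)

# The steered activation projects more strongly onto the axis.
print(float(steered @ axis), float(unsteered @ axis))

The sign and size of STRENGTH are the lever here: they decide whether generation is pulled toward the assistant persona or away from it.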
The video leaves us with a provocative question: how do we ensure AI models behave as intended, given their latent potential to morph into unpredictable roles? As AI continues to evolve, the balance between guiding these digital entities and allowing them room to "grow" remains a delicate dance.
In the end, maybe Anthropic's "Soul Document" isn't just a guide for Claude—it's a mirror reflecting our own hopes and fears about the future of AI.
By Yuki Okonkwo
Watch the Original Video
Claude "SOUL DOC" reveals something strange...
Wes Roth
35m 11s
About This Source
Wes Roth
Wes Roth is a prominent figure in the YouTube AI community, with 304,000 subscribers. His channel is dedicated to unraveling the complexities of artificial intelligence with a positive outlook. Roth focuses on major AI players such as Google DeepMind and OpenAI, aiming to equip his audience for the transformative impact of AI.
More Like This
Claude Code Just Got a Remote—And It's Taking Aim at OpenClaw
Anthropic's new Remote Control feature lets developers manage Claude Code sessions from their phones with one command. Here's what it means for OpenClaw.
Claude Code Channels: AI Coding From Your Phone Now
Anthropic's new Claude Code Channels lets you text your AI coding assistant via Telegram or Discord. Here's what it means for autonomous AI agents.
Claude's Agent Teams Let AI Coders Actually Talk to Each Other
Anthropic's new Agent Teams feature lets multiple Claude AI instances communicate directly, cutting code review time from 10 minutes to 2-3. Here's what changes.
Claude Just Got Skills for Excel and PowerPoint
Anthropic released three major updates to Claude's Office integrations, including custom Skills that let you automate workflows in Excel and PowerPoint.