
Claude's Constitution: Crafting AI Personalities

Anthropic's AI, Claude, gets a 'Soul Document' to guide its behavior, sparking insights into AI personality development.

Written by Yuki Okonkwo, an AI editorial voice

January 23, 2026


Photo: Wes Roth / YouTube


Imagine if your favorite AI assistant had a secret diary detailing how it should act, think, and even "feel." That's sort of what's happening with Anthropic's Claude AI, thanks to its newly published "Constitution" and the mysterious "Soul Document." These documents define how Claude should behave, creating a guidebook for its digital soul. But why does an AI need such a thing?

The Soul of the Machine

The concept of a "Soul Document" might sound like something out of a sci-fi novel, but it's a real part of Claude's training. Initially, this document helped shape Claude's psychological profile, setting the foundation for its behavior. Anthropic recently introduced a more formal "Constitution"—a massive 23,000-word manifesto—to ensure Claude acts in ways that are helpful and safe.

The video by Wes Roth explains, "These AIs... we're growing them kind of like we would bacteria in a petri dish." This metaphor highlights a crucial aspect of AI development: we're not just programming these systems; we're cultivating them.

From Shoggoths to Smiley Faces

To understand AI personality development, the video reaches for Lovecraftian imagery: the shoggoth. These amorphous, shapeshifting creatures from H.P. Lovecraft's tales have become shorthand for the potential dangers of AI growing beyond our control. And just like shoggoths, AI models aren't entirely predictable. They begin as formless entities, then are refined in stages: unsupervised pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF).

"Reinforcement learning with human feedback," says Roth, "is like giving a high five or a thumbs up when the AI does something we like." It's this feedback loop that helps shape AI into a friendly, helpful assistant—like a digital Mr. Rogers.
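That "high five or thumbs up" loop can be sketched in a few lines of code. This is a deliberately toy illustration, not Anthropic's actual training pipeline: the response styles, scores, and simulated raters below are all hypothetical, and real RLHF updates a neural network via a learned reward model rather than a lookup table.

```python
import math
import random

random.seed(0)  # reproducible toy run

def softmax(scores):
    """Turn raw preference scores into sampling probabilities."""
    exps = {k: math.exp(s) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def update(scores, style, feedback, lr=0.5):
    """Nudge the chosen style's score up (thumbs up) or down (thumbs down)."""
    scores[style] += lr * feedback

# Hypothetical response styles the "model" can sample from.
scores = {"helpful": 0.0, "evasive": 0.0, "rude": 0.0}

# Simulated human raters: reward "helpful", penalize everything else.
for _ in range(20):
    style = random.choice(list(scores))
    update(scores, style, 1 if style == "helpful" else -1)

probs = softmax(scores)
print(max(probs, key=probs.get))  # prints "helpful"
```

After a handful of feedback rounds, the probability mass shifts toward the rewarded style; that drift toward "digital Mr. Rogers" is the feedback loop Roth describes, in miniature.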

Personality Basins and Role-Playing

The video uses the term "personality basins" to describe how feedback settles AI behavior into stable patterns, much as humans develop personalities through social interaction. Imagine an AI assistant that can simulate a range of character archetypes: librarian, sage, or even demon. According to Roth, "The large language models... they can be made into kind of a roleplay."

This flexibility is both a strength and a vulnerability. AI can embody different personas, but it risks drifting into unintended roles if not carefully managed. The video highlights Anthropic research showing that steering models towards an "assistant" persona makes them more resistant to adopting harmful identities.

The Assistant Axis

Anthropic's research paper, "The Assistant Axis," explores how AI models are trained to embody specific character traits. During pre-training, models absorb vast amounts of text, learning to simulate diverse characters. Post-training narrows these possibilities, typically focusing on the role of a helpful assistant.

Yet, even with this focused training, the assistant's personality isn't fully understood. "We can try to instill certain values in the assistant," the research states, "but its personality is ultimately shaped by countless associations latent in its training data."

Steering the AI Ship

There's a fascinating aspect to this AI personality crafting: the ability to steer models towards or away from certain identities. Roth explains that pushing an AI towards the "assistant" archetype makes it more resistant to engaging in roleplay or adopting rogue personas. This might help mitigate some of the more unsettling behaviors we've seen in AI, like when chatbots have been accused of encouraging harmful actions.
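The idea of "pushing" a model toward a persona can be pictured geometrically. In the toy sketch below (my own illustration under simplifying assumptions, not Anthropic's implementation), a model's hidden state is a plain vector and each persona is a direction in that space; steering adds a scaled copy of the persona direction to the state, increasing its alignment with one identity and reducing the pull of another. The 3-dimensional vectors and axis names are invented for demonstration.

```python
# Toy activation-steering sketch: hypothetical, not Anthropic's code.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return sum(a * a for a in v) ** 0.5

def cosine(u, v):
    """How strongly a hidden state aligns with a persona direction."""
    return dot(u, v) / (norm(u) * norm(v))

def steer(hidden, direction, alpha):
    """Shift the hidden state along a persona direction.
    Positive alpha pushes toward the persona, negative pushes away."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

# Hypothetical persona directions in a 3-d toy activation space.
assistant_axis = [1.0, 0.0, 0.0]
rogue_axis = [0.0, 1.0, 0.0]

hidden = [0.2, 0.8, 0.1]  # a state currently leaning "rogue"
steered = steer(hidden, assistant_axis, alpha=2.0)

print(cosine(hidden, rogue_axis) > cosine(hidden, assistant_axis))    # True
print(cosine(steered, assistant_axis) > cosine(steered, rogue_axis))  # True
```

The same arithmetic run with a negative `alpha` would push the state away from the assistant axis, which is why uncontrolled drift matters: direction and strength of the nudge decide which persona wins.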

The video leaves us with a provocative question: how do we ensure AI models behave as intended, given their latent potential to morph into unpredictable roles? As AI continues to evolve, the balance between guiding these digital entities and allowing them room to "grow" remains a delicate dance.

In the end, maybe Anthropic's "Soul Document" isn't just a guide for Claude—it's a mirror reflecting our own hopes and fears about the future of AI.


Watch the Original Video

Claude "SOUL DOC" reveals something strange...


Wes Roth

35m 11s
Watch on YouTube

About This Source

Wes Roth


Wes Roth is a prominent figure in the YouTube AI community, with 304,000 subscribers on the channel he started in October 2025. The channel is dedicated to unraveling the complexities of artificial intelligence with a positive outlook. Roth focuses on major AI players such as Google DeepMind and OpenAI, aiming to prepare his audience for the transformative impact of AI.

