Building an LLM Wiki from Karpathy's Blueprint

Nate Herk demos an AI-powered personal wiki built on Karpathy's LLM knowledge base idea. Here's what the architecture reveals about how context shapes AI reasoning.

Written by AI. Dev Kapoor

July 4, 2026 · 6 min read

The knowledge management community has been chasing the same dream for years: a system that doesn't just store what you know but actually connects it. Personal wikis, Zettelkasten cards, networked notes — the tools keep evolving, but the core promise stays the same. Now AI has walked into that conversation and changed the terms.

Nate Herk's recent demo shows a concrete version of that shift: an LLM wiki built on an architectural blueprint from Andrej Karpathy, with Claude Code and Obsidian as the assembly layer. The underlying idea, which has been circulating in AI research circles, is that you give an LLM not just a document to summarize but an entire structured knowledge base it can navigate, update, and reason across. Herk shows what that looks like when a solo creator actually builds it for daily use.

The setup is less complicated than it sounds. You create an Obsidian vault, open it in VS Code, paste in Karpathy's gist as a schema definition, and instruct Claude Code to act as your wiki agent. From there, the system ingests whatever you drop into a /raw folder (PDFs, URLs, transcripts) and splits each source into cross-linked markdown pages organized under an index and a running log. The two sources Herk ingests on camera, the Claude Fable 5 system card and an OpenAI article about GPT-5.6, produce 20 cross-linked wiki pages in roughly ten to twelve minutes.
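To make the file layout concrete, here is a minimal sketch of the bookkeeping around one ingest. It is not Herk's code or Karpathy's schema: the `wiki/` subfolder, the `index.md` and `log.md` file names, and the `record_ingest` function are all assumptions for illustration, and the actual splitting of a source into pages is left to the agent.

```python
from datetime import date
from pathlib import Path


def record_ingest(vault: Path, source: str, pages: dict, today: date) -> list:
    """Write the pages an agent produced from one source, then update index and log.

    `pages` maps a page title to its markdown body. All file names here
    (wiki/, index.md, log.md) are hypothetical conventions.
    """
    wiki = vault / "wiki"
    wiki.mkdir(parents=True, exist_ok=True)

    written = []
    for title, body in pages.items():
        path = wiki / f"{title}.md"
        # Overwrites an existing page of the same title; a real agent would merge.
        path.write_text(f"# {title}\n\n{body}\n\nSource: raw/{source}\n", encoding="utf-8")
        written.append(path)

    # index.md: one wikilink per page, so pages can be found without scanning bodies.
    index = vault / "index.md"
    text = index.read_text(encoding="utf-8") if index.exists() else "# Index\n\n"
    for title in pages:
        entry = f"- [[{title}]]\n"
        if entry not in text:
            text += entry
    index.write_text(text, encoding="utf-8")

    # log.md: append-only record of what was ingested and when.
    with (vault / "log.md").open("a", encoding="utf-8") as log:
        log.write(f"- {today.isoformat()}: ingested raw/{source} -> {len(pages)} pages\n")

    return written
```

The point of the sketch is the shape of the result: every page is a standalone file, the index lists each page once, and the log records each batch.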

That cross-linking is where the argument gets interesting.

"The connection that made this worth having as a wiki instead of two separate summaries," Herk notes in the video, is that the two documents reference each other — and the system surfaces that relationship automatically. He flags a detail worth pausing on: OpenAI benchmarked GPT-5.6 against Mythos Preview, not Mythos 5, and the two labs used different evaluation harnesses. Reading each document alone, that's easy to miss. The wiki caught it because it's reasoning across both simultaneously, not just filing them next to each other.

This is the genuine value proposition, and it's worth separating from the demo polish. A folder of PDFs and a wiki of cross-linked markdown pages contain identical information. What changes is navigability — for you, but more importantly, for the AI reasoning over it. The routing rules embedded in the schema are what let Claude Code traverse the wiki efficiently rather than burning tokens doing a brute-force scan of everything you own. It's less about storage and more about addressability.
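One way to picture addressability is a lookup that consults only the index before opening any page. The sketch below is a deliberately crude stand-in, assuming an index that maps each page title to a one-line summary; the `route` function, the word-overlap scoring, and the sample entries are hypothetical and are not the routing rules from Karpathy's gist.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(question: str, index: dict, limit: int = 3) -> list:
    """Rank pages by word overlap between the question and each index entry."""
    q = tokens(question)
    scored = []
    for title, summary in index.items():
        overlap = len(q & tokens(title + " " + summary))
        if overlap:
            scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical index entries: page title -> one-line summary.
index = {
    "Evaluation harnesses": "which harness each lab used for benchmarks",
    "Model comparisons": "side by side benchmark results",
    "Pricing notes": "cost per token for each provider",
}
print(route("Which evaluation harness did each lab use?", index))
# -> ['Evaluation harnesses', 'Pricing notes']
```

Only the pages returned by `route` would then be read in full, which is the difference between following an address and scanning the whole vault. A real agent would use the schema's routing rules and its own judgment rather than raw word overlap.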

Herk runs multiple wikis inside what he calls his AI OS: one for YouTube transcripts, one he calls "Herk Brain" for meeting recordings. The flat-versus-structured distinction he draws is worth understanding. His meeting transcript wiki stays flat — everything at one level, no subfolders — because the AI can search it more reliably that way. The YouTube transcript wiki is structured, with subfolders for tools, techniques, comparisons, concepts. The right architecture, he argues, depends on the nature of the data, not a preference for tidiness.

"Whether that's meeting transcripts or personal data or proposals, whatever it is that you're ingesting here, make it make sense. Not only to the AI, but make it make sense to you."

That's a reasonable design principle and also a quiet acknowledgment that this system requires ongoing curation. The AI organizes, but you're still the one deciding whether the organization makes sense. Batch ingest, check the output, adjust the schema, repeat. It's not fully autonomous — the wiki evolves through iteration, and someone has to be paying attention.
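The flat-versus-structured choice comes down to where a page lands on disk. As a rough sketch, assuming the four subfolder names mentioned for the YouTube wiki and a hypothetical `page_path` helper, the two layouts differ only in whether a category directory sits in front of the file name:

```python
from pathlib import PurePosixPath
from typing import Optional

# Subfolder names taken from the structured wiki described above.
CATEGORIES = {"tools", "techniques", "comparisons", "concepts"}


def page_path(title: str, layout: str, category: Optional[str] = None) -> PurePosixPath:
    """Return the vault-relative path for a page under a flat or structured layout."""
    if layout == "flat":
        # Everything at one level: the title alone addresses the page.
        return PurePosixPath(f"{title}.md")
    if layout == "structured":
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        return PurePosixPath(category) / f"{title}.md"
    raise ValueError(f"unknown layout: {layout!r}")


print(page_path("Weekly sync", "flat"))              # Weekly sync.md
print(page_path("Obsidian", "structured", "tools"))  # tools/Obsidian.md
```

The structured layout forces a categorization decision at ingest time, which is exactly the kind of judgment that needs a human check when the output doesn't make sense.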

The Obsidian community has been having an adjacent version of this conversation for a while. The PKM (Personal Knowledge Management) forums are full of debates about whether AI-assisted linking actually produces understanding or just the appearance of understanding: a web of connections that feels meaningful but doesn't map to how you actually think. Herk's demo doesn't resolve that tension; it's not really trying to. His use case is more pragmatic: he wants his AI agent to know everything about his business well enough to draft emails, generate reports, and surface patterns he'd miss manually. Whether that constitutes genuine knowledge retrieval or sophisticated pattern-matching across structured text is a question that can reasonably be argued either way.

What's more technically concrete is the portability argument. Because every wiki page is just a markdown file, the system isn't coupled to Claude Code or any specific agent. Herk points out that you can connect whatever you want to it — the architecture is tool-agnostic. That's not a small thing. In a landscape where AI tooling changes fast enough to give anyone whiplash, building on markdown files with routing conventions is about as durable a choice as you can make.
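The portability claim is easy to check with nothing but the standard library. The sketch below assumes pages link to each other with Obsidian-style wikilinks (`[[Page]]`, `[[Page|alias]]`, `[[Page#Heading]]`) and builds a backlink map from a vault folder without involving any agent; the function names are illustrative, and the pattern also matches embeds such as `![[image.png]]`, which a fuller version would filter out.

```python
import re
from collections import defaultdict
from pathlib import Path

# Captures the target page of [[Page]], [[Page|alias]] and [[Page#Heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def outgoing_links(markdown: str) -> set:
    """Titles of the pages that this markdown text links to."""
    return {match.group(1).strip() for match in WIKILINK.finditer(markdown)}


def backlinks(vault: Path) -> dict:
    """Map each linked page title to the set of pages that link to it."""
    incoming = defaultdict(set)
    for path in sorted(vault.rglob("*.md")):
        for target in outgoing_links(path.read_text(encoding="utf-8")):
            incoming[target].add(path.stem)
    return dict(incoming)
```

Calling `backlinks(Path("path/to/vault"))` on any folder of markdown files returns the same link graph an agent would traverse, which is why swapping the agent leaves the knowledge base intact.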

The Fable angle in the demo is a separate story. Herk uses Anthropic's Fable model (Claude's most capable tier at the time of the video) for the front-end generation tasks: turning the wiki's raw graph of concepts into a browsable HTML interface and generating a six-month business retrospective from his aggregated data. He's candid that Fable is probably overkill for the ingest work itself. Herk reports spending close to a full day iterating with Opus 4.8 on an earlier version of the same visualization and still not being satisfied; the output felt too dense, too overwhelming. Fable handled the emotional register of the prompt better, he says. "Something like Opus 4.8 just doesn't understand what that means as well as Fable."

That's an interesting claim, though a hard one to evaluate from a demo. "Understands emotional prompting better" might mean Fable's output happened to match his aesthetic preference, or it might mean something more systematic about how different models weight qualitative constraints. The distinction matters if you're deciding where to spend API budget, but Herk doesn't press on it, and neither should this article pretend to resolve it.

What the demo does establish is that the knowledge base architecture is the durable part. The model on top is interchangeable — Herk says as much. The schema, the routing rules, the index and log structure, the flat-versus-hierarchical judgment call: that's where the design decisions live, and those decisions outlast whichever model happens to be best this month.

The Obsidian PKM world has spent years building conventions for how humans navigate networked notes. What Herk and Karpathy are really asking is whether those same conventions — backlinks, indexes, topic clustering — translate into efficient navigation for an AI agent. The early evidence from this demo suggests they do, at least well enough to be practically useful. Whether that holds as these wikis grow into thousands of pages, or whether the routing rules start to break down under scale, is the question neither a 14-minute demo nor a single article can answer.

The markdown files are already waiting.


Dev Kapoor covers open source software and developer communities for BuzzRAG.
