Building an LLM Wiki from Karpathy's Blueprint

The knowledge management community has been chasing the same dream for years: a system that doesn't just store what you know but actually connects it. Personal wikis, Zettelkasten cards, networked notes — the tools keep evolving, but the core promise stays the same. Now AI has walked into that conversation and changed the terms.

Nate Herk's recent demo shows a practical version of this: an LLM wiki built on an architectural blueprint from Andrej Karpathy, with Claude Code and Obsidian as the assembly layer. The blueprint has been circulating in AI research circles. Its core idea is that you give an LLM not just a document to summarize but an entire structured knowledge base it can navigate, update, and reason across. Herk shows what that looks like when a solo creator actually builds it for daily use.

The setup is less complicated than it sounds. You create an Obsidian vault, open it in VS Code, paste in Karpathy's gist as a schema definition, and instruct Claude Code to treat itself as your wiki agent. From there, the system ingests whatever you drop into a /raw folder (PDFs, URLs, transcripts) and splits each source into cross-linked markdown pages organized under an index and a running log. The two sources Herk ingests on camera, the Claude Fable 5 system card and an OpenAI article about GPT-5.6, produce 20 cross-linked wiki pages in roughly ten to twelve minutes.
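
To make that loop concrete, here is a minimal Python sketch of the bookkeeping half of an ingest. It is not Karpathy's schema or Herk's actual setup: the wiki/ folder, the index.md and log.md file names, and the record_ingest and slugify helpers are all assumptions made for illustration, and in the real workflow the page contents would come from the agent rather than from this code.

```python
from datetime import date
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a page title into a filesystem-friendly file stem."""
    cleaned = "".join(c.lower() if c.isalnum() else "-" for c in title)
    return "-".join(part for part in cleaned.split("-") if part)


def record_ingest(vault: Path, source_name: str, pages: dict[str, str], today: date) -> list[Path]:
    """Write one markdown file per page, then update index.md and log.md.

    `pages` maps a page title to its markdown body. All file and folder
    names here are hypothetical, not taken from the gist.
    """
    wiki_dir = vault / "wiki"
    wiki_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for title, body in pages.items():
        path = wiki_dir / f"{slugify(title)}.md"
        path.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
        written.append(path)

    # The index is a list of wikilinks the agent can read before anything else.
    with (vault / "index.md").open("a", encoding="utf-8") as index:
        for path in written:
            index.write(f"- [[{path.stem}]]\n")

    # The log is append-only: one line per ingested source.
    with (vault / "log.md").open("a", encoding="utf-8") as log:
        log.write(f"- {today.isoformat()}: ingested {source_name} -> {len(written)} pages\n")

    return written
```

Calling record_ingest once per file dropped into the raw folder would leave the index and log in step with the pages on disk, which is the property the rest of the system leans on.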

That cross-linking is where the argument gets interesting.

"The connection that made this worth having as a wiki instead of two separate summaries," Herk notes in the video, is that the two documents reference each other, and the system surfaces that relationship automatically. He flags a detail worth pausing on: OpenAI benchmarked GPT-5.6 against Fable Preview, not Fable 5, and the two labs used different evaluation harnesses. Reading each document alone, that's easy to miss. The wiki caught it because it's reasoning across both simultaneously, not just filing them next to each other.

This is the genuine value proposition, and it's worth separating from the demo polish. A folder of PDFs and a wiki of cross-linked markdown pages contain identical information. What changes is navigability — for you, but more importantly, for the AI reasoning over it. The routing rules embedded in the schema are what let Claude Code traverse the wiki efficiently rather than burning tokens doing a brute-force scan of everything you own. It's less about storage and more about addressability.
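
The demo doesn't walk through the routing rules line by line, so what follows is only a sketch of the idea. It assumes an index whose lines look like "- topic: [[page-a]], [[page-b]]" and single-word topics; that format, and the parse_index and route helpers, are hypothetical. The point is that the agent consults a small table and opens a handful of pages instead of reading the whole vault.

```python
import re

# Captures the target of an Obsidian-style [[wikilink]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def parse_index(index_text: str) -> dict[str, list[str]]:
    """Read lines such as '- benchmarks: [[page-a]], [[page-b]]' into a routing table."""
    routes: dict[str, list[str]] = {}
    for line in index_text.splitlines():
        topic, sep, links = line.lstrip("- ").partition(":")
        if sep:  # skip lines that carry no 'topic: links' pair
            routes[topic.strip().lower()] = [m.strip() for m in WIKILINK.findall(links)]
    return routes


def route(question: str, routes: dict[str, list[str]]) -> list[str]:
    """Return only the pages whose index topic appears as a word in the question."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    selected: list[str] = []
    for topic, pages in routes.items():
        if topic in words:
            selected.extend(p for p in pages if p not in selected)
    return selected
```

A question that mentions "benchmarks" would resolve to the pages listed under that topic, and a question that matches nothing returns an empty list, at which point a real agent would presumably fall back to a wider search.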

Herk runs multiple wikis inside what he calls his AI OS: one for YouTube transcripts, one he calls "Herk Brain" for meeting recordings. The flat-versus-structured distinction he draws is worth understanding. His meeting transcript wiki stays flat — everything at one level, no subfolders — because the AI can search it more reliably that way. The YouTube transcript wiki is structured, with subfolders for tools, techniques, comparisons, concepts. The right architecture, he argues, depends on the nature of the data, not a preference for tidiness.
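
In code, the two layouts differ only in where a page's path comes from. The sketch below is an illustration, not Herk's configuration: the page_path helper and the layout names "flat" and "structured" are assumptions, while the four subfolder names are the ones he describes.

```python
from pathlib import PurePosixPath

# Subfolder names as described for the structured YouTube transcript wiki.
CATEGORIES = {"tools", "techniques", "comparisons", "concepts"}


def page_path(slug: str, category: str, layout: str) -> PurePosixPath:
    """Return where a page lives relative to the vault root.

    layout="flat" keeps every page at one level; layout="structured"
    files the page under its category subfolder.
    """
    if layout == "flat":
        return PurePosixPath(f"{slug}.md")
    if layout == "structured":
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        return PurePosixPath(category) / f"{slug}.md"
    raise ValueError(f"unknown layout: {layout}")
```

Under the flat layout the category is ignored entirely, which is the trade: nothing to misfile, but no folder structure to narrow a search either.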

"Whether that's meeting transcripts or personal data or proposals, whatever it is that you're ingesting here, make it make sense. Not only to the AI, but make it make sense to you."

That's a reasonable design principle and also a quiet acknowledgment that this system requires ongoing curation. The AI organizes, but you're still the one deciding whether the organization makes sense. Batch ingest, check the output, adjust the schema, repeat. It's not fully autonomous — the wiki evolves through iteration, and someone has to be paying attention.

The Obsidian community has been having an adjacent version of this conversation for a while. The PKM (Personal Knowledge Management) forums are full of debates about whether AI-assisted linking actually produces understanding or just the appearance of understanding — a web of connections that feels meaningful but doesn't map to how you actually think. Herk's demo doesn't resolve that tension; it's not really trying to. His use case is more pragmatic: he wants his AI agent to know everything about his business well enough to draft emails, generate reports, and surface patterns he'd miss manually. Whether that constitutes genuine knowledge retrieval or sophisticated pattern-matching across structured text is a question you can take either way.

What's more technically concrete is the portability argument. Because every wiki page is just a markdown file, the system isn't coupled to Claude Code or any specific agent. Herk points out that you can connect whatever you want to it — the architecture is tool-agnostic. That's not a small thing. In a landscape where AI tooling changes fast enough to give anyone whiplash, building on markdown files with routing conventions is about as durable a choice as you can make.
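
One way to see the portability claim: the link graph can be rebuilt with nothing but the Python standard library, no agent involved. This sketch assumes pages use Obsidian-style wikilinks ([[page]], [[page#heading]], [[page|alias]]); the extract_links, build_backlinks, and load_vault helpers are hypothetical names, not part of any tool shown in the demo.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches [[target]], [[target#heading]], [[target|alias]], [[target#heading|alias]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(markdown: str) -> list[str]:
    """Return wikilink targets, dropping any '#heading' or '|alias' suffix."""
    return [target.strip() for target in WIKILINK.findall(markdown)]


def build_backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the pages that link to it."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for name, text in sorted(pages.items()):
        for target in extract_links(text):
            if name not in backlinks[target]:
                backlinks[target].append(name)
    return dict(backlinks)


def load_vault(root: Path) -> dict[str, str]:
    """Read every markdown file under root, keyed by file stem."""
    return {p.stem: p.read_text(encoding="utf-8") for p in root.rglob("*.md")}
```

Feeding load_vault's result into build_backlinks gives any script, or any other agent, the same map of which pages point at which, and a page linked from two different sources is exactly the kind of cross-document connection the wiki is meant to surface.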

The Fable angle in the demo is a separate story. Herk uses Anthropic's Fable model (Claude's most capable tier at the time of the video) for the front-end generation tasks: turning the wiki's raw graph of concepts into a browsable HTML interface and generating a six-month business retrospective from his aggregated data. He's candid that Fable is probably overkill for the ingest work itself. Herk reports spending close to a full day iterating with Opus 4.8 on an earlier version of the same visualization and still not being satisfied, because the output felt too dense, too overwhelming. Fable handled the emotional register of the prompt better, he says. "Something like Opus 4.8 just doesn't understand what that means as well as Fable."

That's an interesting claim, though a hard one to evaluate from a demo. "Understands emotional prompting better" might mean Fable's output happened to match his aesthetic preference, or it might mean something more systematic about how different models weight qualitative constraints. The distinction matters if you're deciding where to spend API budget, but Herk doesn't press on it, and neither should this article pretend to resolve it.

What the demo does establish is that the knowledge base architecture is the durable part. The model on top is interchangeable — Herk says as much. The schema, the routing rules, the index and log structure, the flat-versus-hierarchical judgment call: that's where the design decisions live, and those decisions outlast whichever model happens to be best this month.

The Obsidian PKM world has spent years building conventions for how humans navigate networked notes. What Herk and Karpathy are really asking is whether those same conventions — backlinks, indexes, topic clustering — translate into efficient navigation for an AI agent. The early evidence from this demo suggests they do, at least well enough to be practically useful. Whether that holds as these wikis grow into thousands of pages, or whether the routing rules start to break down under scale, is the question neither a 14-minute demo nor a single article can answer.

The markdown files are already waiting.

Dev Kapoor covers open source software and developer communities for Buzzrag.