Google's Open Knowledge Format for AI Agents

Google's Open Knowledge Format promises to fix how AI agents navigate knowledge bases. Here's what it actually does, what it doesn't, and why the structure matters more than the tool.

Written by AI. Marcus Chen-Ramirez

June 27, 2026 · 8 min read

There's a specific kind of frustration that comes from building something elegant for yourself and then watching it fall apart when anyone else—or any AI agent—tries to use it. The second brain community knows this intimately. Thousands of people have spent real hours architecting personal knowledge systems in Notion, Obsidian, or plain markdown directories, only to discover that what makes sense to their own brain is effectively illegible to everyone else's, including Claude's.

Google's newly released Open Knowledge Format (OKF) is a direct attempt to address this. It's not a product you install. It's a specification—a proposed standard for how knowledge files should be structured so that AI agents can navigate them efficiently, regardless of which model or platform is doing the navigating. That distinction matters, and we'll come back to it.

The actual problem, stated plainly

The AI Labs team, who broke down OKF in a recent video, put the failure mode well: "The real problem is that Claude doesn't know the info it needs already exists in the knowledge base. It only finds things when it actively searches for them. So unless you tell it to look in a certain file, it won't even know that file is there."

This is a structural issue, not a Claude issue per se. Current AI agents navigate file systems roughly the way a new employee would navigate a filing cabinet with no labels—by opening drawers and hoping. In a small knowledge base, that's annoying but manageable. In a large, deeply nested one, the agent makes multiple attempts to find the right file, each attempt burning tokens. It misfiles new documents because it doesn't know a relevant folder already exists under a slightly different name. It creates duplicates. The cost compounds as the base grows.

The common workaround has been claude.md files: documents that tell the agent how to navigate the surrounding directory. It works, up to a point. But a claude.md file is itself a bespoke artifact built around the creator's mental model. Someone new—human or machine—still has to decode it.

Where OKF actually comes from

OKF didn't come from nowhere. It formalizes what Andrej Karpathy called the LLM Wiki pattern: using markdown files rather than vector databases as the substrate for AI memory. The contrast with Retrieval-Augmented Generation (RAG) is instructive. In a RAG system, your documents are split into chunks and converted into numerical vectors that capture semantic meaning. When you query, the system retrieves the chunks whose vectors sit closest to the query's vector, and the model composes an answer from them. It's powerful for certain use cases, but as the AI Labs team notes, "Whenever you ask a question, the agent is basically rebuilding the information from scratch. It hands you an answer, but it's not building up any knowledge over time."
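To make the retrieval step concrete, here is a minimal sketch of nearest-vector lookup. Everything in it is illustrative: the chunk ids and the three-dimensional vectors are hand-written stand-ins (real embeddings come from a model and have hundreds or thousands of dimensions), and `top_k` is a hypothetical helper, not part of any particular RAG library.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k chunk ids whose vectors are closest to the query."""
    ranked = sorted(store, key=lambda cid: cosine(query_vec, store[cid]), reverse=True)
    return ranked[:k]

# Hand-written toy vectors (hypothetical; a real system would embed text with a model).
STORE = {
    "onboarding.md#1": [0.9, 0.1, 0.0],
    "billing.md#4": [0.1, 0.9, 0.1],
    "deploy.md#2": [0.0, 0.2, 0.9],
}

print(top_k([0.8, 0.3, 0.0], STORE, k=2))
```

Note what the lookup does not do: nothing is written back to the store, which is the "not building up any knowledge over time" point in the quote.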

The LLM Wiki approach bets instead on an agent's ability to navigate a file system—to read an index, decide what's relevant, and pull only what it needs. Done well, this is more like how a researcher works: you don't re-read the entire library every time you have a question; you go to the right shelf.

OKF's contribution is to standardize that shelf-labeling system. Each folder contains an index.md that describes its contents before the agent opens anything. Each document has YAML front matter—a small metadata block at the top—that tells the agent what the file is about without requiring it to load the full content first. The core design principle is minimalism: one file, one concept. The moment a document mixes topics, the agent's ability to selectively load only what it needs starts to degrade.
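A rough sketch of the "metadata before content" idea, in Python. The field names (`title`, `summary`, `tags`) are assumptions made for illustration and are not taken from the OKF specification, and the parser only handles flat `key: value` lines rather than full YAML. The point is that the reader stops at the closing `---`, so the body is never loaded.

```python
import io

def read_front_matter(stream):
    """Read only the leading '---' delimited metadata block from a text stream.

    Simplification: handles flat 'key: value' lines only, not full YAML.
    Stops at the closing delimiter, so the document body is never read.
    """
    meta = {}
    if stream.readline().strip() != "---":
        return meta  # no front matter present
    for line in stream:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta

# Hypothetical document; field names are illustrative, not from the OKF spec.
DOC = """---
title: Refund policy
summary: When and how customers can request a refund
tags: billing, support
---
# Refund policy

Long body text that the agent does not need to read yet.
"""

stream = io.StringIO(DOC)
meta = read_front_matter(stream)
print(meta["summary"])
print(stream.readline().strip())  # the body is still unread at this point
```

An agent working this way could scan the summaries of every file in a folder and open only the one or two documents that look relevant.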

The AI Labs team describes the second key principle as separating the knowledge from whoever is consuming it: "Whether it's an agent, a human, a team member, or anything else, the knowledge itself stays independent. It's not tied to any specific platform, which is what makes it usable with pretty much anything."

That's the value proposition in a sentence. A knowledge base that is legible to a new human teammate on day one and to a fresh AI agent session equally, without either needing a guided tour.

What the actual implementation revealed

The AI Labs team tested OKF against their own team knowledge base, which is version-controlled with Git, shared via GitHub, and already battle-tested across multiple team members. That process surfaced a few things worth noting for anyone considering adopting the format.

First, the official tooling is currently coupled to Google BigQuery. If your knowledge base isn't sitting in Google's data warehouse—and most people's aren't—the enrichment agent that converts data into OKF concept documents simply doesn't apply to you. The team worked around this by building their own skill: a Claude-powered converter that takes any folder of markdown files and produces a valid OKF bundle. Code handles the mechanical transformation; the agent only steps in for judgment calls.
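The team's converter isn't reproduced in the source, so the following is only a guess at what the mechanical half of such a tool might look like: a script that walks a folder of markdown files and writes an index from their front matter. The index layout, the `summary` field, and the `build_index` helper are all assumptions for illustration, not the OKF specification or the team's actual skill.

```python
import tempfile
from pathlib import Path

def front_matter(path):
    """Parse flat 'key: value' front matter (a simplification of YAML)."""
    meta = {}
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return meta
        for line in fh:
            if line.strip() == "---":
                break
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta

def build_index(folder):
    """Return index.md text listing each markdown file with its summary."""
    lines = [f"# Index of {folder.name}", ""]
    for path in sorted(folder.glob("*.md")):
        if path.name == "index.md":
            continue  # do not list the index inside itself
        meta = front_matter(path)
        # A missing summary is the kind of judgment call left to an agent or a human.
        summary = meta.get("summary", "TODO: needs summary")
        lines.append(f"- [{path.name}]({path.name}): {summary}")
    return "\n".join(lines) + "\n"

# Demo on a throwaway folder with two hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "policies"
    folder.mkdir()
    (folder / "refunds.md").write_text("---\nsummary: Refund rules\n---\nBody\n", encoding="utf-8")
    (folder / "notes.md").write_text("No metadata here\n", encoding="utf-8")
    index_text = build_index(folder)
    (folder / "index.md").write_text(index_text, encoding="utf-8")
    print(index_text)
```

The split mirrors the one the team describes: the script does everything deterministic, and files flagged with a TODO are the only ones that need a model's (or a person's) attention.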

Second—and this is the part that will frustrate early adopters—even after converting their entire knowledge base into proper OKF structure, the agent initially ignored it. "When we first asked it to look for a file, it just defaulted to the way it normally searches by matching patterns. And that's because OKF isn't a widely adopted standard yet and only came out recently, so Claude didn't really know it existed."

The fix was manual: adding an explanation of OKF's structure directly to the claude.md file. Once that was in place, the agent started using the index files, loaded YAML metadata before opening full documents, and retrieved results faster with lower token consumption. It worked. But it only worked because they told the agent how to work it.

This is the honest current state of OKF: it's a technically sound specification that requires active configuration to deliver its benefits, because no agent natively understands it yet. The team's own framing of this is refreshingly undramatic: "Until it becomes an open standard that agents support out of the box, this is more of an optimization than something you really need."

Why Google might care beyond the obvious

There's one thread here that's worth pulling on. The AI Labs team raised it as speculation, but it's grounded speculation: Google is actively working to transform web search into something more agentic—where AI agents query and synthesize content across the web rather than returning a list of blue links. Websites are already beginning to add llms.txt files, machine-readable documents that give AI systems structured context about a site.

OKF, at web scale, could be a natural extension of that infrastructure. Instead of an AI agent trying to parse unstructured web content, it could query an OKF bundle that a website publishes alongside its regular pages: structured, semantic, fast to navigate. The AI Labs team notes that "websites might eventually start adding OKF bundles, too. That would let agents query their content more efficiently."

If that's the trajectory Google is betting on, then OKF reads less as a developer productivity tool and more as early infrastructure for a different kind of web—one organized not around what humans browse but around what agents can efficiently query. Whether that's a good thing depends heavily on who controls the standard, who can afford to produce compliant bundles, and what gets excluded from agentic search because it doesn't meet the spec.

Those questions aren't answered by OKF's release. They're opened by it.

The standardization pattern

Zooming out: the AI Labs team situates OKF within a broader pattern of standardization that's been rolling through the AI agent ecosystem. MCP standardized how agents talk to external tools. Skills standardized reusable instruction sets. design.md standardized how design intent gets communicated to agents. Now OKF attempts the same for knowledge. Each of these standards, when it works, makes the ecosystem more composable: you can mix and match tools, agents, and knowledge bases without rebuilding integrations from scratch.

That's genuinely useful. But it's also worth noting who tends to set these standards and what their interests are. Google releasing a knowledge standard while also building agentic search products is not a neutral act. It's a reasonable one—the format appears technically well-reasoned—but the question of which standard becomes dominant, and who that serves, is always partly a political question, not just a technical one.

Right now, OKF is early enough that those dynamics are still being written. For practitioners building AI systems today, the format offers real efficiency gains for knowledge-heavy workflows, provided you're willing to bridge the tooling gaps yourself. For everyone else, it's worth watching—because how AI agents learn to navigate knowledge will shape what knowledge they can actually find.


By Marcus Chen-Ramirez, Senior Technology Correspondent, Buzzrag
