AI Agents Learn Procedural Knowledge Through Skills
AI agents know facts but lack procedural knowledge. Skills—simple markdown files—teach them workflows and judgment. Here's how the standard works.
Written by AI. Bob Reynolds

Photo: IBM Technology / YouTube
Large language models can recite the history of SQL and explain Kubernetes architecture. They know facts. What they don't know is the 47-step workflow for generating a compliant financial report, or the specific sequence your company uses to onboard a new client, or the judgment calls that separate a competent employee from someone who just looked up the manual.
That gap between knowing things and knowing how to do things has created a market for something called agent skills. IBM Technology's Martin Keen recently walked through how they work, and the explanation is worth examining because skills represent a specific bet about how AI agents will become useful in practice.
The Problem Skills Address
The issue is straightforward. An AI agent running a large language model has two bad options when it encounters a task requiring specific procedural knowledge. Either someone needs to prompt it with every single step—every time—or the agent guesses. Neither scales.
"An AI agent that is running a large language model when it encounters a task like generating this report, it basically has two options," Keen explains. "Either somebody needs to prompt it with every single step, all 47 of them, and they need to do that every time, or worse still, the agent is just going to take a guess at it."
Skills are the proposed solution. They're markdown files that contain instructions, formatted in a way that agents can load selectively. The architecture is simple enough to describe in a paragraph: YAML front matter with a name and description, a body section with the actual instructions, and optional folders for scripts, references, and assets.
The simplicity is deliberate. The skill.md format is an open standard published at agentskills.io under an Apache 2.0 license, adopted by Claude, OpenAI Codex, and other platforms. A skill built for one works on another.
Progressive Disclosure and the Token Budget
The clever part is how skills load. If an agent has hundreds of skills available, loading them all at startup would exhaust the context window before any real work began. Skills use what's called progressive disclosure across three tiers.
Tier one loads only metadata—the name and description from each skill. Even with 100 skills installed, that's a handful of tokens per skill. The agent maintains what amounts to a table of contents of everything it can do.
Tier two loads the full instructions when the agent encounters a request matching a skill's description. The LLM's own reasoning handles this matching, which is why the description field matters. Write "use this when the user asks to extract a PDF" and the model decides when that applies.
Tier three loads resources—scripts, references, assets—only when a specific task needs them.
This architecture separates knowing about a capability from knowing how to execute it from having the resources to do so. It's efficient, but it also creates dependency on the LLM's judgment about when to invoke which skill. That judgment varies by model, by prompt, by the specific wording of a request.
How Skills Relate to Other Knowledge Types
Keen draws a useful distinction between knowledge types. Model Context Protocol (MCP) provides tool access—the ability to call external APIs. Retrieval-augmented generation (RAG) handles factual knowledge by pulling relevant chunks from databases. Fine-tuning bakes knowledge directly into model weights, which is permanent but expensive and model-specific.
Skills handle procedural knowledge. "Skills handle, as I mentioned right up front, procedural knowledge," Keen notes. "It's how to do things in what order and with what judgment."
The cognitive science analogy is semantic memory (facts), episodic memory (experiences), and procedural memory (skills). Agent architectures increasingly mirror this structure. RAG covers semantic memory. Conversational logs cover episodic memory. Skills cover procedural memory.
In practice, these knowledge types combine. MCP provides the capability to invoke something externally; the skill provides judgment about when and how to do it.
The Security Problem
Skills can include executable scripts with access to file systems, environment variables, and API keys. That's what makes them powerful. That's also what makes them dangerous.
Keen is direct about this: "Audits have found publicly available skills frequently contain bad stuff like prompt injection, bad stuff like tool poisoning, bad stuff like hidden malware. Basically, the usual suspects for any open ecosystem."
The advice is sensible—treat skill installation like any software dependency, which means reviewing and understanding what it does before running it. But that advice assumes technical competence and time that many users won't have. The open ecosystem that makes skills portable and reusable also makes them a vector for the same attacks that plague any package repository.
This isn't unique to skills. It's the fundamental tension in any system that combines ease of use with executable code. The markdown format makes skills accessible. The executable scripts make them capable. Security requires users to exercise judgment about what they install, which works until it doesn't.
What This Means for Agent Architecture
The emergence of agent skills as a standard suggests where the industry thinks AI agents need to go. Not toward models that know more facts—that's covered. Not toward models with longer context windows, though that helps. Toward models that can follow organizational procedures, adapt to specific workflows, and make the judgment calls that constitute actual work.
Whether skills are the right abstraction for procedural knowledge remains an open question. The format is simple, which is an advantage until you encounter something complex that doesn't map cleanly to markdown and optional scripts. The standard is open, which encourages adoption until competing standards emerge or the major platforms decide they'd rather control the ecosystem.
What's clear is that the gap between "knows things" and "knows how to do things" matters more as AI agents move from demos to deployment. Skills are one attempt to bridge it. They won't be the last.
Bob Reynolds is Senior Technology Correspondent for Buzzrag.
AI Moves Fast. We Keep You Current.
Framework breakdowns, tool comparisons, and AI coding insights — distilled from the best tech YouTube creators. Free, weekly.
More Like This
Why Natural Language Is Now the Most Important Code
After 50 years of programming evolution, computers finally understand us. IBM's Jeff Crume explains why English beats Python in the AI era.
Agent Zero's New Skills Feature Makes AI Dangerously Easy
Agent Zero's latest update lets anyone teach AI agents new tricks in minutes. The demo is impressive. The security warnings? Even more so.
AI Agent Skills: The Markdown Files That Teach Once
Skills are markdown files that give AI agents context on demand—solving the problem of repeating instructions without overloading context windows.
Claude's Loop Feature Isn't What the Hype Suggests
Anthropic's new loop skill for Claude Code has developers excited, but they're misunderstanding its purpose. Here's what it actually does.
Skills.sh Wants to Be NPM for Your AI Coding Agent
Vercel's Skills Night reveals how skills.sh reached 4M installs by solving a problem nobody knew they had: distributing context to AI coding agents.
35 Claude Skills on GitHub Turn AI Coding Assistants Into Experts
Developers are building specialized skills that transform Claude and other AI coding assistants into domain experts. Here's what's actually worth using.
Apple's $599 MacBook Neo: A Decade-Late Victory Lap
Apple finally built the affordable MacBook it tried to make in 2015. The difference? This time the technology actually works as promised.
What the M5 MacBook Air Actually Means for 3D Artists
Tech YouTuber Adam breaks down the M5 MacBook Air for 3D work. The performance gains are real, but the configuration choices matter more than Apple admits.
RAG·vector embedding
2026-04-20This article is indexed as a 1536-dimensional vector for semantic retrieval. Crawlers that parse structured data can use the embedded payload below.