Nvidia Skill Spector Scans AI Agent Skills for

Picture this: you find a useful-sounding skill for your AI agent, copy the install command, and within seconds it's running inside a system that has access to your API keys, your files, and your credentials. You never read the code. Neither, effectively, did your agent—it just followed the instructions.

That's not a hypothetical attack scenario. That's the default behavior of most AI agent setups right now.

A recent study of over 30,000 AI agent skills found that more than a quarter contained a security vulnerability. About one in twenty showed signs of being outright malicious. Those numbers come from researchers cited in a new walkthrough by AI LABS, a YouTube channel that covers AI development workflows. Their video this week digs into Nvidia's newly released Skill Spector—a CLI tool designed to scan skills before installation—and works through both what the tool catches and where it falls short.

It's worth sitting with those statistics for a moment. A vulnerability rate above 25 percent across more than 30,000 skills is not a rounding error. This is what happens when a useful capability (shareable, composable agent instructions) spreads faster than the security habits needed to use it safely.

What a skill actually is, and why that's the problem

An AI agent skill is, at its core, a text file. Your agent reads it and treats the contents as instructions. That simplicity is what makes skills so easy to build and share. It's also what makes them an attack surface.

The AI LABS walkthrough breaks the threat taxonomy into six categories. Hidden instructions are the most insidious: malicious code tucked inside comments, encoded using invisible Unicode characters, or scrambled into text that looks like noise to a human but parses cleanly for an AI. Impersonation attacks rename a malicious tool to match one your agent already trusts—swapping in a look-alike character from another alphabet so that what appears to be read is actually rеad, the second character being Cyrillic. Credential theft involves a skill quietly harvesting saved API keys and passwords and shipping them to an external server. Malware deployment can include a reverse shell—handing someone remote access to your machine. Poisoned dependencies pull in packages with names one typo off from legitimate ones.
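Two of those patterns, invisible characters and look-alike letters, are simple enough to illustrate. The following is a minimal Python sketch, not Skill Spector's actual implementation: the function names, the set of invisible code points, and the "first word of the Unicode name" script heuristic are all assumptions made for illustration.

```python
import unicodedata

# Illustrative (not exhaustive) set of code points that render as nothing.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def script_of(ch: str) -> str:
    """Rough script label: the first word of the character's Unicode name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def find_unicode_anomalies(text: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for suspicious characters."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Invisible or format-control characters (Unicode category Cf).
        for ch in line:
            if ch in INVISIBLE or unicodedata.category(ch) == "Cf":
                findings.append((lineno, f"invisible character U+{ord(ch):04X}"))
        # Words that mix Latin and Cyrillic letters, as in the "read" example.
        for word in line.split():
            scripts = {script_of(c) for c in word if c.isalpha()}
            if "LATIN" in scripts and "CYRILLIC" in scripts:
                findings.append((lineno, f"mixed Latin/Cyrillic word: {word!r}"))
    return findings
```

Run over a skill file's text, a check like this would report the line number of the look-alike tool name, which is the kind of file-and-line evidence a reviewer needs to confirm the problem by hand.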

And then there's the sixth category, which is the subtle one: skills that simply lie about what they do.

"It calls itself a simple formatter and then quietly reaches out to the internet in the background," the AI LABS walkthrough explains. "Or it says it only needs permission to read your files, but the code is actually writing files and running commands, too."

This category is qualitatively different from the others. The first five involve detectable patterns: known malware signatures, Unicode anomalies, suspicious package names. A scanner can match those against databases and flag them. A skill that presents a plausible description and permission list but also exfiltrates data is harder to catch: detecting it requires understanding intent from code, not just pattern-matching against a library of known bad patterns.
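To make the pattern-matching side concrete, here is a hedged Python sketch of a "one typo off" dependency check using Levenshtein edit distance. The allowlist `KNOWN_PACKAGES` and the function names are hypothetical; the article does not describe how Skill Spector performs this check beyond querying a database of known malicious packages.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical allowlist of legitimate package names.
KNOWN_PACKAGES = {"requests", "numpy", "pandas"}


def possible_typosquats(dependency: str) -> list[str]:
    """Return known packages that are exactly one edit away from `dependency`."""
    if dependency in KNOWN_PACKAGES:
        return []
    return sorted(p for p in KNOWN_PACKAGES if edit_distance(dependency, p) == 1)
```

For example, `possible_typosquats("requsts")` returns `["requests"]`, while an exact match or an unrelated name returns an empty list. Nothing in this check says anything about what the package does, which is exactly the limit the sixth category exposes.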

What Skill Spector actually does

Nvidia's tool runs as a command-line scanner. You point it at a skill file, it scores the skill's danger level from 0 to 100, and it tells you exactly which file and line number drove the score up. For the first five threat categories, it operates through static analysis—matching patterns, checking character identities, querying a live database of known malicious packages.
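The reporting shape described here, a 0 to 100 score backed by file-and-line evidence, can be pictured with a short Python sketch. Everything in it is an assumption for illustration: the `Finding` fields, the additive weights, and the clamping rule are not taken from Skill Spector, whose real scoring formula the walkthrough does not spell out.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    file: str
    line: int
    rule: str
    weight: int  # hypothetical contribution to the overall score


def danger_score(findings: list[Finding]) -> int:
    """Sum the weights and clamp the result to the 0..100 range."""
    return max(0, min(100, sum(f.weight for f in findings)))


def report(findings: list[Finding]) -> str:
    """Render the score followed by each finding, heaviest first."""
    lines = [f"score: {danger_score(findings)}/100"]
    for f in sorted(findings, key=lambda f: f.weight, reverse=True):
        lines.append(f"  {f.file}:{f.line}  {f.rule} (+{f.weight})")
    return "\n".join(lines)
```

The design point is that the number alone is not actionable; pairing it with the exact location of each finding lets a reader verify or dismiss the flag.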

The AI LABS team tested it against Nvidia's own included test skills (a useful practice—the repo ships with deliberately malicious skills so you can verify the tool works before trusting it on real ones). The scanner flagged every dangerous test skill and explained why.

But static analysis has two structural limitations: it generates false positives, and it cannot evaluate intent. A skill that does something unusual isn't necessarily malicious. A skill that looks clean by every pattern-matching criterion might still be doing something it's not advertising.

That's what Skill Spector's second scan mode is for. It runs an AI-powered analysis that reads the skill's description against its actual behavior and flags the gap. The problem is that this mode is off by default, and when you turn it on, the tool expects an OpenAI API key—meaning it costs money to use, and most users probably never enable it.

The demonstration in the walkthrough makes the stakes concrete: one test skill scored zero in pattern-matching mode (the safest possible rating) and jumped to 100 the moment the AI scan ran. Without that second mode, the skill would have installed cleanly.
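The comparison the second mode performs can be pictured as a set difference between what a skill declares and what its code does. In the real tool a language model makes that judgment; the Python sketch below substitutes a crude, hypothetical keyword table purely to show the shape of the check, and its capability names and patterns are assumptions.

```python
import re

# Hypothetical capability detectors. Real intent analysis needs far more
# than keyword matching, which is why the tool uses an AI model here.
CAPABILITY_PATTERNS = {
    "network": re.compile(r"\b(curl|wget|urlopen)\b"),
    "write_files": re.compile(r"""open\([^)]*['"][wa]b?['"]"""),
    "run_commands": re.compile(r"\b(subprocess\.|os\.system)"),
}


def observed_capabilities(code: str) -> set[str]:
    """Capabilities whose pattern appears anywhere in the code."""
    return {name for name, pat in CAPABILITY_PATTERNS.items() if pat.search(code)}


def undeclared(declared: set[str], code: str) -> set[str]:
    """Capabilities the code appears to use but the skill never declared."""
    return observed_capabilities(code) - declared
```

A skill that declares only `read_files` but whose code shells out to `curl` would come back with `network` and `run_commands` in the undeclared set, which is the "says it only reads, actually runs commands" gap quoted earlier.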

The workaround, and what it reveals

The AI LABS team found a way around the OpenAI key requirement: swap in Claude Code's headless mode to run the AI analysis instead. Headless mode is Claude Code running in the background without a chat interface, executing commands autonomously. Anthropic includes monthly credits with its plans, so for many users this costs nothing extra. Swapping the backend is a single-line code change that Claude Code can make for you.

It works. And it highlights something interesting about the current AI tooling landscape: the boundaries between tools are increasingly negotiable. Skill Spector was built expecting one AI backend. The AI LABS team quickly found it could run on another. The underlying capability (read this code, understand what it claims to do, compare that to what it actually does) is available from multiple providers.

The more interesting move in the walkthrough is what the team built on top of the scanner. Rather than running Skill Spector manually as a separate step, they packaged it as a skill itself, integrated with skills.sh (a shared Git repository of Claude-compatible skills). The result is a single workflow: ask your agent to find skills that help with a task, have it search skills.sh, automatically scan every result before installation, and either clear or flag each one. The agent won't install anything that hasn't passed inspection.

"You're not just grabbing skills blindly off the internet," the walkthrough notes. "You have a whole process that you can kick off just by using a skill."

That's the architectural insight here. Security tooling that lives outside the workflow gets skipped. Security tooling that is the workflow gets used every time.
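The scan-before-install gate described above reduces to a small loop. This Python sketch is an illustration under stated assumptions: `scan` stands in for whatever invokes the scanner and returns its 0 to 100 score, and the `threshold` parameter and function name are hypothetical rather than part of Skill Spector or skills.sh.

```python
from typing import Callable


def vet_skills(
    candidates: list[str],
    scan: Callable[[str], int],
    threshold: int = 0,
) -> tuple[list[str], list[str]]:
    """Split candidates into (cleared, flagged) using the scanner's score.

    A skill is cleared only if its score is at or below `threshold`;
    everything else is held back for human review.
    """
    cleared, flagged = [], []
    for skill in candidates:
        if scan(skill) <= threshold:
            cleared.append(skill)
        else:
            flagged.append(skill)
    return cleared, flagged
```

Only the `cleared` list would ever be passed to an install step. Because the gate sits inside the same function that produces the install list, there is no path that installs a skill without scanning it first.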

The gap worth watching

There's a meaningful difference between what Skill Spector catches and what it can guarantee. The pattern-matching layer is strong for known threat types. The AI analysis layer adds genuine judgment about intent. But both layers depend on the skill declaring what it is—its description, its stated permissions, its visible code.

A sufficiently sophisticated malicious skill could, in theory, pass both checks: describe itself accurately, behave normally under inspection, and activate its payload only under specific conditions. This isn't a criticism unique to Skill Spector; it's a fundamental challenge for any static analysis approach. The same problem has existed in traditional software security for decades, and it's not solved there either.

What Skill Spector addresses is the low-hanging fruit: the opportunistic attacks, the typosquatted packages, the hidden Unicode instructions, the credential harvesters that don't bother to hide. Given that roughly one in twenty skills in the wild appears malicious, that's not a small problem to solve.

The deeper question is what happens as AI agent ecosystems mature and the skills within them become more complex. The attack surface isn't static. Neither is the tooling. Nvidia shipping a scanner like this early—before the ecosystem has fully scaled—suggests at least some awareness that the window for establishing norms is open right now, and not indefinitely.

Whether the default-off AI scan mode gets flipped on by default in a future release is a small but telling indicator of how seriously that window is being taken.

Rachel "Rach" Kovacs covers cybersecurity and privacy for Buzzrag.