When Your AI Agent Should Actually Be a Workflow

LinkedIn's AI slop problem has a name now. Louis-François Bouchard calls it out in the opening minutes of a recent workshop on building deep research agents: the meaningless phrases, the hallucinated statistics, the "rapidly evolving" clichés that signal a ChatGPT draft nobody bothered to edit. His slide shows real examples: posts claiming that "most teams" do something, or citing GPT-4 as state-of-the-art weeks after it stopped being so. The tells are everywhere once you know to look.

But Bouchard, CTO of Towards AI, isn't just complaining about bad content. He's explaining why automating research and writing is harder than it looks, and why most people are solving it wrong.

The workshop—a two-hour deep dive into building MCP-powered research agents—starts with a framework that matters more than the code. Before you build anything, Bouchard argues, you need to understand the difference between a workflow and an agent. Most people skip this step. They hear "AI agent" and start assembling multi-agent systems for problems that need three sequential function calls.

"Most of these agents that our clients want are actually somewhat super simple workflows," Bouchard says, "or at least workflows that we can come up with pretty easily."

The distinction isn't academic. It's the difference between a system that costs $0.02 per task and one that costs $2. Between something that works reliably and something that's impressive in demos but fails in production.

The Autonomy Slider

Bouchard describes AI engineering as navigating an "autonomy slider"—a spectrum from simple prompts to full agentic systems. At the low end, you have basic prompt engineering. In the middle, workflows with predetermined steps. At the high end, agents that can react to their environment and make autonomous decisions.

The trap is assuming more autonomy is always better. It isn't. Every step up that slider adds cost, reduces control, and introduces new failure modes.

Start with the simplest solution. If the model already knows enough to answer the question, just prompt it. Add few-shot examples if needed. That's it. If you need external context and it fits in 200,000 tokens, paste it in and use context caching. Still not an agent.

If context needs to be retrieved dynamically—because it's private, recent, or domain-specific—inject it on the fly. Still a workflow. Chain prompts together, add routing logic, run steps in parallel, even add feedback loops with a judge. All workflow territory.

You cross into agent land when the system needs to react to what happens in its environment. When it must decide which tools to use—and which not to use—based on what it discovers. When branching is dynamic rather than predetermined.
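
The progression above can be read as a decision procedure. What follows is a minimal sketch of that reading, not Bouchard's own code: the Task fields and the returned labels are hypothetical names chosen for illustration, and sending oversized context to retrieval is an assumption. Only the 200,000-token figure comes from the workshop.

```python
from dataclasses import dataclass

# Figure quoted in the workshop for context that can simply be pasted in.
PASTE_LIMIT_TOKENS = 200_000


@dataclass
class Task:
    """Hypothetical description of a task, used only for this sketch."""
    model_knows_enough: bool         # can the base model answer unaided?
    context_tokens: int              # size of any external context needed
    needs_dynamic_retrieval: bool    # private, recent, or domain-specific data
    must_react_to_environment: bool  # tool choice depends on what is discovered


def choose_architecture(task: Task) -> str:
    """Return the least autonomous option that still covers the task."""
    if task.must_react_to_environment:
        return "agent"
    if task.needs_dynamic_retrieval:
        return "workflow with retrieval"
    if task.model_knows_enough:
        return "prompt"
    if task.context_tokens <= PASTE_LIMIT_TOKENS:
        return "prompt with pasted context"
    # Assumption: context too large to paste is handled by retrieval.
    return "workflow with retrieval"
```

The only branch that returns "agent" is the one where behavior depends on what the system discovers at run time. Everything else stays on the cheaper end of the slider.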

Bouchard's team built a support ticket system for a client: receive ticket, classify, route to team, draft response, validate against policy, send. Six steps, always the same order. "Building this as an agent would just add overhead without adding anything extra," he explains. They built a workflow.
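
A fixed pipeline like that one is little more than a list of functions called in order. The sketch below is an illustration under stated assumptions, not the client system: the Ticket fields, the keyword rule, the team names, and the toy policy check are all hypothetical stand-ins for what would be model calls and real policy logic.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    text: str
    category: str = ""
    team: str = ""
    draft: str = ""
    approved: bool = False
    log: list = field(default_factory=list)  # records the steps that ran


def classify(t: Ticket) -> Ticket:
    # Keyword rule standing in for a model call (assumption).
    t.category = "billing" if "refund" in t.text.lower() else "general"
    t.log.append("classify")
    return t


def route(t: Ticket) -> Ticket:
    # Hypothetical team names.
    t.team = {"billing": "finance", "general": "support"}[t.category]
    t.log.append("route")
    return t


def draft_response(t: Ticket) -> Ticket:
    t.draft = f"Hello, the {t.team} team is looking into your request."
    t.log.append("draft")
    return t


def validate(t: Ticket) -> Ticket:
    # Toy policy: a reply must never promise an outcome.
    t.approved = "guarantee" not in t.draft.lower()
    t.log.append("validate")
    return t


def send(t: Ticket) -> Ticket:
    # Approved drafts go out; anything else is escalated to a person.
    t.log.append("send" if t.approved else "escalate")
    return t


# The order is fixed in code. Nothing here decides what to do next.
STEPS = (classify, route, draft_response, validate, send)


def handle(raw_text: str) -> Ticket:
    ticket = Ticket(text=raw_text)
    ticket.log.append("receive")
    for step in STEPS:
        ticket = step(ticket)
    return ticket
```

Because the sequence lives in STEPS rather than in a model's judgment, every ticket takes the same path and each step can be tested on its own.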

Another client, a Canadian CRM platform, wanted a multi-agent system to generate marketing content. They were applying for an AI grant and thought agents sounded impressive. After talking through the actual requirements, Bouchard's team built a single agent with specialized tools instead. Same capabilities, fraction of the complexity.

"We use tools as specialists but the global context stays within our only agent," he says. Each tool can have its own system prompt, validation logic, even its own LLM calls. But keeping one decision-maker prevents the information loss and error propagation that come from splitting logic across multiple agents.
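
One way to picture "tools as specialists" is a single agent object that owns the shared context and a set of tools that each carry their own prompt and validation. This is a rough sketch under assumptions: fake_llm, SpecialistTool, Agent, and the two example tools are hypothetical names, and the step where a model chooses which tool to call is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Callable


def fake_llm(system_prompt: str, user_input: str) -> str:
    """Stand-in for a real model call (hypothetical)."""
    return f"[{system_prompt}] {user_input}"


@dataclass
class SpecialistTool:
    name: str
    system_prompt: str               # each tool carries its own prompt
    validate: Callable[[str], bool]  # and its own validation logic

    def run(self, request: str) -> str:
        output = fake_llm(self.system_prompt, request)
        if not self.validate(output):
            raise ValueError(f"{self.name} produced invalid output")
        return output


@dataclass
class Agent:
    tools: dict
    context: list = field(default_factory=list)  # the single shared context

    def call(self, tool_name: str, request: str) -> str:
        result = self.tools[tool_name].run(request)
        # Every result flows back into the one agent's context.
        self.context.append(f"{tool_name}: {result}")
        return result


agent = Agent(tools={
    "headline": SpecialistTool("headline", "Write a headline", lambda s: len(s) < 200),
    "email": SpecialistTool("email", "Write a short email", lambda s: bool(s.strip())),
})
agent.call("headline", "spring product launch")
agent.call("email", "spring product launch")
```

After both calls, agent.context holds both results in one place, which is the property the quote describes: the specialists do the work, but nothing is handed off between separate decision-makers.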

The Context Budget Problem

Here's where it gets interesting. Even with a single agent, you hit a constraint: the context window. Not the advertised 1 million token limit—the practical limit where performance degrades.

"Context rot" starts around 200,000 tokens, Bouchard notes, well before you hit the model's technical ceiling. This traces back to how long-context models are trained: by inserting random facts into large corpora and testing whether the model can retrieve them. They learn to find specific needles, not to synthesize entire haystacks.

Your context budget includes everything: system prompt, tool definitions, few-shot examples, retrieved data, conversation history. As tasks progress, context grows and performance declines. You need strategies to keep it lean.

Trim content. Summarize. Retrieve selectively. Or—and this is where multi-agent systems finally make sense—delegate to sub-agents with their own context windows. Not because agents are cool, but because you've genuinely outgrown a single context.
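
Trimming is the easiest of those strategies to show in code. The sketch below keeps the system prompt and as many of the most recent conversation turns as fit in a budget. It rests on assumptions: the four-characters-per-token estimate is a rough heuristic, not a real tokenizer, and fit_to_budget is a hypothetical helper rather than anything shown in the workshop.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about four characters per token.
    return max(1, len(text) // 4)


def fit_to_budget(system_prompt: str, history: list, budget_tokens: int) -> list:
    """Keep the newest turns that fit once the system prompt is counted."""
    used = estimate_tokens(system_prompt)
    kept = []
    for turn in reversed(history):  # walk from newest to oldest
        cost = estimate_tokens(turn)
        if used + cost > budget_tokens:
            break                   # everything older is dropped
        kept.append(turn)
        used += cost
    kept.reverse()                  # restore chronological order
    return kept
```

A production system would count tokens with the model's own tokenizer and would likely summarize the dropped turns instead of discarding them, but the budgeting logic has the same shape.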

Bouchard suggests multi-agent architectures when you have over 20 tools, when context becomes unmanageable, or when you need autonomous decision-making across distinct domains. Also for compliance reasons—one agent per hospital in healthcare, for instance, to keep data local.
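
Those criteria fit in a short checklist. In this sketch, the 20-tool threshold is the one Bouchard gives. Treating "unmanageable" context as anything past 200,000 tokens is an assumption borrowed from the context-rot figure, and the function and parameter names are hypothetical.

```python
MAX_TOOLS_SINGLE_AGENT = 20   # threshold mentioned in the workshop
MAX_CONTEXT_TOKENS = 200_000  # assumption: reuses the context-rot figure


def reasons_for_multi_agent(
    tool_count: int,
    peak_context_tokens: int,
    autonomous_across_domains: bool,
    data_must_stay_local: bool,
) -> list:
    """Return the reasons that apply; an empty list means stay single-agent."""
    reasons = []
    if tool_count > MAX_TOOLS_SINGLE_AGENT:
        reasons.append("too many tools")
    if peak_context_tokens > MAX_CONTEXT_TOKENS:
        reasons.append("context too large")
    if autonomous_across_domains:
        reasons.append("autonomous decisions across domains")
    if data_must_stay_local:
        reasons.append("compliance")
    return reasons
```

An empty result is the common case, and it means the single agent with specialist tools is still the right fit.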

Deep Research as Training Ground

This framework leads to the workshop's main project: a deep research agent. Bouchard calls these systems "one of the best projects to learn how to build such a complex end-to-end system" because they integrate every technique he's described.

Deep research agents plan their own research strategy. They search, inspect sources, pivot based on findings, and synthesize information. They're goal-driven—you tell them what to research, not how. They cite sources, incorporate feedback loops, and iterate until they have something useful.

Towards AI built theirs out of necessity. Creating technical courses and articles requires senior AI engineers who can both build systems and explain them. That's expensive expertise spent on a task that isn't particularly lucrative. Automating the research and first-draft writing frees those engineers for work that requires human judgment: storytelling, pedagogy, the choices that make content actually useful.

The system takes a topic, researches it thoroughly via web search and tool use, then feeds findings into a separate technical writing workflow. Note the architecture: agentic for research (inherently exploratory), constrained workflow for writing (benefits from tighter structure and review loops).
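
The split can be sketched in a few lines. This is an illustration under assumptions, not the Towards AI system: FAKE_WEB is a hypothetical stand-in for web search, the follow-up queries it returns play the role of decisions a model would make, and the writing steps are placeholders.

```python
from collections import deque

# Hypothetical stand-in for web search: query -> (finding, follow-up queries).
FAKE_WEB = {
    "topic": ("finding A", ["subtopic"]),
    "subtopic": ("finding B", []),
}


def research(topic: str, max_steps: int = 5) -> list:
    """Exploratory loop: what gets searched next depends on earlier results."""
    queue, seen, findings = deque([topic]), set(), []
    while queue and len(seen) < max_steps:
        query = queue.popleft()
        if query in seen:
            continue
        seen.add(query)
        finding, follow_ups = FAKE_WEB.get(query, ("", []))
        if finding:
            findings.append(finding)
        queue.extend(follow_ups)  # pivot based on what was found
    return findings


def write(findings: list) -> str:
    """Constrained workflow: the same steps in the same order every time."""
    outline = [f"- {item}" for item in findings]  # step 1: outline
    draft = "\n".join(outline)                    # step 2: draft
    return draft if draft else "(no findings)"    # step 3: review gate


report = write(research("topic"))
```

The research function cannot know its path in advance, while the writing function never varies. That asymmetry is the architectural point.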

"Research and writing require much different architectures," the workshop description emphasizes. This isn't obvious until you've built both and seen what works.

The team used their system to build a course teaching others to build the same system. Recursive, useful, and a forcing function for iteration based on student feedback.

What Actually Matters

Five decades covering technology teaches you to look past the architecture diagrams to the constraints that actually govern decisions. In AI engineering, those constraints are cost per task, latency requirements, quality thresholds, and data privacy. The stack—prompt engineering, RAG, tool use, orchestration, evaluation—exists to navigate those constraints.

Bouchard frames this clearly because Towards AI operates as both builder and educator. They can't afford to over-engineer for clients, and they can't teach techniques that don't work in production. The workshop distills what survived contact with reality.

Most AI products, he notes, aren't purely workflows or purely agentic. They combine both, using the right architecture for each component. Understanding when to use which approach—that's the skill that matters. Not whether you can say you built an agent.

The research agent workshop includes code walkthroughs, MCP server setup, tool implementation, live demos, and sections on observability and evaluation with LLM judges. Practical stuff, built by people who use it. The GitHub repository is public.

But the part worth paying attention to comes in those first thirty minutes before the code. That's where Bouchard maps the terrain so you can navigate it yourself. The autonomy slider, the context budget, the questions to ask before building anything. These determine whether your impressive demo becomes a system people actually use.

—Bob Reynolds, Senior Technology Correspondent