Edited by humans. Written by AI.

When Your AI Agent Should Actually Be a Workflow

Most AI 'agents' should be workflows instead. A technical workshop reveals why autonomy isn't always better—and how to choose the right architecture.

Bob Reynolds

April 21, 2026 · 7 min read
Three presenters stand before a whiteboard with AI architecture diagrams, with overlaid text reading "AI Engineer Europe…

Photo: AI Engineer / YouTube

LinkedIn's AI slop problem has a name now. Louis-François Bouchard calls it out in the opening minutes of a recent workshop on building deep research agents: the meaningless phrases, the hallucinated statistics, the "rapidly evolving" clichés that signal a ChatGPT draft nobody bothered to edit. His slide shows real examples: posts claiming "most teams" do something, or citing GPT-4 as state-of-the-art weeks after it stopped being so. The tells are everywhere once you know to look.

But Bouchard, CTO of Towards AI, isn't just complaining about bad content. He's explaining why automating research and writing is harder than it looks, and why most people are solving it wrong.

The workshop—a two-hour deep dive into building MCP-powered research agents—starts with a framework that matters more than the code. Before you build anything, Bouchard argues, you need to understand the difference between a workflow and an agent. Most people skip this step. They hear "AI agent" and start assembling multi-agent systems for problems that need three sequential function calls.

"Most of these agents that our clients want are actually somewhat super simple workflows," Bouchard says, "or at least workflows that we can come up with pretty easily."

The distinction isn't academic. It's the difference between a system that costs $0.02 per task and one that costs $2. Between something that works reliably and something that's impressive in demos but fails in production.

The Autonomy Slider

Bouchard describes AI engineering as navigating an "autonomy slider"—a spectrum from simple prompts to full agentic systems. At the low end, you have basic prompt engineering. In the middle, workflows with predetermined steps. At the high end, agents that can react to their environment and make autonomous decisions.

The trap is assuming more autonomy is always better. It isn't. Every step up that slider adds cost, reduces control, and introduces new failure modes.

Start with the simplest solution. If the model already knows enough to answer the question, just prompt it. Add few-shot examples if needed. That's it. If you need external context and it fits in 200,000 tokens, paste it in and use context caching. Still not an agent.

If context needs to be retrieved dynamically—because it's private, recent, or domain-specific—inject it on the fly. Still a workflow. Chain prompts together, add routing logic, run steps in parallel, even add feedback loops with a judge. All workflow territory.

You cross into agent land when the system needs to react to what happens in its environment. When it must decide which tools to use—and which not to use—based on what it discovers. When branching is dynamic rather than predetermined.
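The progression above can be read as a decision procedure. The following is a minimal sketch of that reading, not code from the workshop: the `Task` fields, the `choose_architecture` function, and the returned labels are all hypothetical names chosen for illustration, and the 200,000-token figure is simply the one quoted in the text.

```python
from dataclasses import dataclass

# Figure quoted in the article for context that can simply be pasted in.
STATIC_CONTEXT_LIMIT = 200_000


@dataclass
class Task:
    """Hypothetical description of a task, used only for this sketch."""
    model_knows_answer: bool    # can the base model answer unaided?
    context_tokens: int         # size of any external context required
    context_is_dynamic: bool    # must context be fetched per request?
    steps_known_upfront: bool   # is the sequence of steps fixed in advance?


def choose_architecture(task: Task) -> str:
    """Return the least autonomous option that still fits the task."""
    if not task.steps_known_upfront:
        # Branching depends on what the system discovers at run time.
        return "agent"
    if task.context_is_dynamic:
        return "workflow with retrieval"
    if task.model_knows_answer:
        return "prompt"
    if task.context_tokens <= STATIC_CONTEXT_LIMIT:
        return "prompt with cached context"
    return "workflow with retrieval"


print(choose_architecture(Task(False, 50_000, False, True)))
```

The ordering of the checks is a design choice in this sketch: only the absence of a predetermined sequence of steps pushes a task into agent territory, which mirrors the article's point that retrieval, routing, and feedback loops alone do not.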

Bouchard's team built a support ticket system for a client: receive ticket, classify, route to team, draft response, validate against policy, send. Six steps, always the same order. "Building this as an agent would just add overhead without adding anything extra," he explains. They built a workflow.
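A fixed pipeline like this one maps naturally onto plain function calls. The sketch below is an illustration under stated assumptions, not the client system: every function name, the keyword-based classifier, the routing table, and the policy rule are hypothetical stubs standing in for real LLM calls and business logic.

```python
def classify(ticket: str) -> str:
    # Stub: a real system would call an LLM or a trained classifier here.
    return "billing" if "invoice" in ticket.lower() else "general"


def route(category: str) -> str:
    # Hypothetical routing table; unknown categories go to general support.
    return {"billing": "finance-team"}.get(category, "support-team")


def draft_response(ticket: str, team: str) -> str:
    # Stub for an LLM drafting step.
    return f"[{team}] Thanks for contacting us about: {ticket}"


def validate(draft: str) -> bool:
    # Hypothetical policy rule: never promise a refund in a draft.
    return "refund guaranteed" not in draft.lower()


def handle_ticket(ticket: str) -> dict:
    """Receive, classify, route, draft, validate, send: always in this order."""
    category = classify(ticket)
    team = route(category)
    draft = draft_response(ticket, team)
    approved = validate(draft)
    # "Sending" is represented by a flag so the sketch has no side effects.
    return {"category": category, "team": team, "draft": draft, "sent": approved}


result = handle_ticket("My invoice is wrong")
print(result["team"], result["sent"])
```

Nothing in `handle_ticket` chooses what to do next; the order is written into the code, which is what makes it a workflow rather than an agent.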

Another client, a Canadian CRM platform, wanted a multi-agent system to generate marketing content. They were applying for an AI grant and thought agents sounded impressive. After talking through the actual requirements, Bouchard's team built a single agent with specialized tools instead. Same capabilities, fraction of the complexity.

"We use tools as specialists but the global context stays within our only agent," he says. Each tool can have its own system prompt, validation logic, even its own LLM calls. But keeping one decision-maker prevents the information loss and error propagation that come from splitting logic across multiple agents.
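One way to picture "tools as specialists" is a single agent object that owns the shared context and calls tools that each carry their own system prompt. This is a rough sketch under heavy assumptions: `fake_llm`, `make_tool`, `choose_tool`, `SingleAgent`, and the tool names are all hypothetical, and the keyword-based `choose_tool` merely stands in for the model-driven tool selection a real agent would perform.

```python
from typing import Callable


def fake_llm(system_prompt: str, user_input: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return f"[{system_prompt}] {user_input}"


def make_tool(system_prompt: str) -> Callable[[str], str]:
    """Build a specialist tool that carries its own system prompt."""
    def tool(task: str) -> str:
        return fake_llm(system_prompt, task)
    return tool


TOOLS = {
    "write_headline": make_tool("headline specialist"),
    "write_email": make_tool("email specialist"),
}


def choose_tool(request: str) -> str:
    # Hypothetical stub for the agent's own decision about which tool to use.
    return "write_email" if "email" in request.lower() else "write_headline"


class SingleAgent:
    def __init__(self, tools):
        self.tools = tools
        self.context = []  # the global context lives only in this one agent

    def handle(self, request: str) -> str:
        tool_name = choose_tool(request)
        result = self.tools[tool_name](request)
        self.context.append((tool_name, result))  # every result returns here
        return result


agent = SingleAgent(TOOLS)
print(agent.handle("Draft an email about the spring sale"))
```

Because every tool result flows back into `agent.context`, there is one place where decisions are made and one record of what has happened so far.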

The Context Budget Problem

Here's where it gets interesting. Even with a single agent, you hit a constraint: the context window. Not the advertised 1 million token limit—the practical limit where performance degrades.

"Context rot" starts around 200,000 tokens, Bouchard notes, well before you hit the model's technical ceiling. This traces back to how long-context models are trained: by inserting random facts into large corpuses and testing retrieval. They learn to find specific needles, not to synthesize entire haystacks.

Your context budget includes everything: system prompt, tool definitions, few-shot examples, retrieved data, conversation history. As tasks progress, context grows and performance declines. You need strategies to keep it lean.
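The budget can be made concrete with some simple accounting. The sketch below is illustrative only: the four-characters-per-token estimate is a crude assumption (a real system would use the model's tokenizer), the function names are hypothetical, and dropping the oldest messages is just one of several possible trimming strategies.

```python
# Practical threshold cited in the article, not a hard model limit.
CONTEXT_ROT_THRESHOLD = 200_000


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly four characters per token for English text.
    return len(text) // 4


def budget_report(parts: dict, limit: int = CONTEXT_ROT_THRESHOLD) -> dict:
    """Tally estimated tokens for each named part of the context."""
    usage = {name: estimate_tokens(text) for name, text in parts.items()}
    total = sum(usage.values())
    return {"usage": usage, "total": total, "over_budget": total > limit}


def trim_history(history: list, fixed_tokens: int, limit: int) -> list:
    """Drop the oldest messages until fixed parts plus history fit the limit."""
    kept = list(history)
    while kept and fixed_tokens + sum(estimate_tokens(m) for m in kept) > limit:
        kept.pop(0)
    return kept


parts = {"system_prompt": "a" * 400, "history": "b" * 4000}
print(budget_report(parts)["total"])
```

Here `fixed_tokens` represents the parts that cannot be trimmed, such as the system prompt and tool definitions, so only conversation history is sacrificed.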

Trim content. Summarize. Retrieve selectively. Or—and this is where multi-agent systems finally make sense—delegate to sub-agents with their own context windows. Not because agents are cool, but because you've genuinely outgrown a single context.

Bouchard suggests multi-agent architectures when you have over 20 tools, when context becomes unmanageable, or when you need autonomous decision-making across distinct domains. Also for compliance reasons—one agent per hospital in healthcare, for instance, to keep data local.
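Those criteria can be written down as an explicit checklist. The following is a sketch of that checklist rather than anything shown in the workshop: the function name and parameters are hypothetical, and treating "unmanageable" context as anything above 200,000 tokens is an assumption borrowed from the context rot figure above.

```python
def multi_agent_reasons(
    tool_count: int,
    context_tokens: int,
    needs_cross_domain_autonomy: bool,
    data_must_stay_local: bool,
) -> list:
    """Return the reasons, if any, that justify splitting into multiple agents."""
    reasons = []
    if tool_count > 20:
        reasons.append("more than 20 tools")
    if context_tokens > 200_000:  # assumed proxy for "unmanageable" context
        reasons.append("context too large for one agent")
    if needs_cross_domain_autonomy:
        reasons.append("autonomous decisions across distinct domains")
    if data_must_stay_local:
        reasons.append("data must stay local")
    return reasons


# An empty list means a single agent (or a workflow) is still the better fit.
print(multi_agent_reasons(5, 10_000, False, False))
```

Returning the reasons instead of a bare yes or no keeps the justification visible, which fits the article's argument that multi-agent designs should be adopted for a stated need rather than for appearance.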

Deep Research as Training Ground

This framework leads to the workshop's main project: a deep research agent. Bouchard calls these systems "one of the best projects to learn how to build such a complex end-to-end system" because they integrate every technique he's described.

Deep research agents plan their own research strategy. They search, inspect sources, pivot based on findings, and synthesize information. They're goal-driven—you tell them what to research, not how. They cite sources, incorporate feedback loops, and iterate until they have something useful.

Towards AI built theirs out of necessity. Creating technical courses and articles requires senior AI engineers who can both build systems and explain them. That's expensive expertise deployed on a not-particularly-lucrative task. Automating the research and first-draft writing frees those engineers for work that requires human judgment: storytelling, pedagogy, the choices that make content actually useful.

The system takes a topic, researches it thoroughly via web search and tool use, then feeds findings into a separate technical writing workflow. Note the architecture: agentic for research (inherently exploratory), constrained workflow for writing (benefits from tighter structure and review loops).
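The split between an exploratory research phase and a fixed writing phase can be sketched in a few lines. This is a toy illustration, not the Towards AI system: the `search` stub, its tiny in-memory corpus, the "See also:" convention for follow-up leads, and the function names are all hypothetical, and no real web search or model is called.

```python
def search(query: str) -> list:
    # Hypothetical stub standing in for a web search tool.
    corpus = {
        "agents vs workflows": ["Workflows follow fixed steps.", "See also: context rot"],
        "context rot": ["Quality degrades in very long contexts."],
    }
    return corpus.get(query, [])


def research(topic: str, max_steps: int = 5) -> list:
    """Agentic phase: the next query depends on what earlier searches found."""
    findings, queue, seen, steps = [], [topic], set(), 0
    while queue and steps < max_steps:
        query = queue.pop(0)
        if query in seen:
            continue
        seen.add(query)
        steps += 1
        for result in search(query):
            if result.startswith("See also: "):
                queue.append(result[len("See also: "):])  # pivot to a new lead
            else:
                findings.append(result)
    return findings


def write(findings: list) -> str:
    """Workflow phase: outline, draft, review, always in that order."""
    outline = [f"- {finding}" for finding in findings]
    draft = "\n".join(outline)
    return draft if draft else "No findings."  # minimal review step


print(write(research("agents vs workflows")))
```

The loop in `research` has no fixed path, since each result can add new queries, while `write` always runs the same steps; the `max_steps` cap is an added safeguard in this sketch so that exploration cannot run indefinitely.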

"Research and writing require much different architectures," the workshop description emphasizes. This isn't obvious until you've built both and seen what works.

The team used their system to build a course teaching others to build the same system. Recursive, useful, and a forcing function for iteration based on student feedback.

What Actually Matters

Five decades covering technology teach you to look past the architecture diagrams to the constraints that actually govern decisions. In AI engineering, those constraints are cost per task, latency requirements, quality thresholds, and data privacy. The stack (prompt engineering, RAG, tool use, orchestration, evaluation) exists to navigate those constraints.

Bouchard frames this clearly because Towards AI operates as both builder and educator. They can't afford to over-engineer for clients, and they can't teach techniques that don't work in production. The workshop distills what survived contact with reality.

Most AI products, he notes, aren't purely workflows or purely agentic. They combine both, using the right architecture for each component. Understanding when to use which approach—that's the skill that matters. Not whether you can say you built an agent.

The research agent workshop includes code walkthroughs, MCP server setup, tool implementation, live demos, and sections on observability and evaluation with LLM judges. Practical stuff, built by people who use it. The GitHub repository is public.

But the part worth paying attention to comes in those first thirty minutes before the code. That's where Bouchard maps the terrain so you can navigate it yourself. The autonomy slider, the context budget, the questions to ask before building anything. These determine whether your impressive demo becomes a system people actually use.

—Bob Reynolds, Senior Technology Correspondent
