IBM's Bob IDE Tackles Legacy COBOL—But Should You Care?
IBM's new Bob IDE promises to modernize legacy COBOL systems with AI. A test conversion of an ATM system reveals what works—and what's familiar.
Written by AI. Mike Sullivan
April 27, 2026

Photo: Better Stack / YouTube
IBM has released Bob, an AI-powered IDE that promises to help developers wrangle legacy systems—specifically the kind of COBOL codebases that power ATMs and banking infrastructure nobody wants to touch. The folks at Better Stack put it through its paces by asking it to convert an ancient COBOL ATM system into a modern Python application. It took three minutes.
Before you get too excited, let's talk about what that actually means.
The Familiar Pattern
Every few months, someone releases a new AI coding assistant that's going to "revolutionize" development. We've seen this movie before—Claude, Gemini, Copilot, Cursor, and now Bob. IBM's pitch is that Bob is different because it focuses on "architectural governance" rather than just spitting out code snippets. Instead of a single chat window, you get modes: Ask, Code, Plan, Review. You can even create custom modes.
This is interesting, not because modes are revolutionary—they're not—but because they acknowledge something most AI coding tools pretend isn't a problem: developers need different things at different times, and conflating them creates chaos. A question isn't the same as a task isn't the same as a security audit. IBM is betting that structure matters.
The Better Stack team tested Bob's Review mode, which scans for security issues like hardcoded secrets, injection risks, and OWASP violations. When they ran it on the modernized Python code, Bob flagged multiple issues. Click a lightbulb icon, and Bob attempts to fix the problem automatically. In one case, it identified a SQLite race condition and fixed it with a single line: adding BEGIN IMMEDIATE for proper locking. Then it asked if they wanted unit tests for the fix.
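For readers who haven't met BEGIN IMMEDIATE, here is a minimal sketch of the kind of fix described above. It is not Bob's actual output or the demo's code: the accounts table, the withdraw function, and the open_demo_db helper are all hypothetical, chosen only to illustrate the locking pattern with Python's standard sqlite3 module.

```python
import sqlite3


def withdraw(conn: sqlite3.Connection, account_id: int, amount: int) -> int:
    """Deduct `amount` from an account and return the new balance.

    BEGIN IMMEDIATE takes SQLite's write lock before the balance is read,
    so another writer cannot change it between the check and the update.
    """
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        new_balance = row[0] - amount
        conn.execute(
            "UPDATE accounts SET balance = ? WHERE id = ?",
            (new_balance, account_id),
        )
        conn.execute("COMMIT")
        return new_balance
    except Exception:
        # Undo any partial work and release the lock before re-raising.
        conn.execute("ROLLBACK")
        raise


def open_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database (hypothetical schema)."""
    # isolation_level=None keeps the sqlite3 module from opening implicit
    # transactions, so the explicit BEGIN/COMMIT above are the only ones.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    return conn
```

Calling withdraw(open_demo_db(), 1, 30) returns 70. With a plain deferred BEGIN, two connections could both read the old balance before either one writes. Taking the write lock up front makes the second writer wait (or fail with a "database is locked" error) instead.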
Here's what's worth noticing: this isn't magic, but it's useful. The structure—findings panel, one-click fixes, test generation—makes the workflow more legible than staring at a terminal wondering what your AI agent is doing. As the video creator put it: "I honestly really like using IDEs opposed to CLIs where I don't understand what the agent is doing most of the time."
The COBOL Question
IBM has deep roots in mainframe systems, which theoretically gives Bob specialized knowledge of languages like COBOL. The test seemed to confirm this—Bob successfully converted a COBOL ATM system (an open-source repo called Z bank) into a functional Python web app with a Streamlit UI. The result worked. You could log in, perform operations, the basics functioned.
But let's be clear about what "modernizing COBOL" means in practice. This wasn't touching production banking infrastructure. This was a contained demo with hardcoded credentials and no real-world complexity. As for the quality of the output, the creator's main critique was about the interface: the UI "does lack a bit of design judgment," with popup text that was too bright, that sort of thing. Functional, not beautiful.
What's more interesting is what happened when they ran Review mode on the original COBOL code. Bob identified eight security issues in the legacy implementation. When they tried to fix one and asked for tests, Bob responded that it couldn't add tests because "this is typical for legacy mainframe applications that rely on manual testing or mainframe specific testing tools not present in the repository."
Read that again. The AI essentially said the repository ships with no automated tests to build on, and that this is normal for legacy mainframe code. Which raises an uncomfortable question: if your production COBOL system is anything like this demo, you have bigger problems than whether Bob can modernize it. You have code running critical infrastructure that was never designed to be tested, validated, or maintained by modern standards.
What IBM Isn't Saying
Every vendor presentation focuses on the success case. IBM shows you Bob converting COBOL in three minutes. What they don't show you is the 47 edge cases where the conversion breaks, the business logic that gets lost in translation, the regulatory requirements that assume COBOL's specific behaviors, or the integration points with other ancient systems.
Legacy modernization is hard not because the syntax is mysterious—though COBOL certainly has its charms—but because decades of institutional knowledge are embedded in code that nobody fully understands anymore. The developer who wrote it retired in 1997. The documentation lives in a filing cabinet in a basement. The business rules have changed sixteen times but the code only reflects fourteen of them.
Can Bob help with that? Maybe. It can probably handle straightforward conversions of well-structured code. But "well-structured legacy code" is often an oxymoron, and the kind of organization that still runs COBOL ATM systems probably has the messy kind.
The Economics You Should Understand
Bob uses a credit system: the free trial includes 40 "Bob coins," and one coin equals 50 cents USD, so the trial is worth about $20. The three-minute COBOL conversion cost about four coins, or roughly two dollars. That's cheap for a demo. For an actual migration project? Start doing the math on thousands of files, multiple iterations, edge case handling, and testing.
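If you want to start on that math, here is a back-of-the-envelope sketch. Only the three constants come from the video. Everything else is a loud assumption: that cost scales linearly, that every unit of work costs what the demo did, and the 2,000-unit, three-pass project in the usage note is purely hypothetical. Real migrations will not be this tidy.

```python
COIN_PRICE_USD = 0.50    # from the video: one Bob coin is 50 cents
COINS_PER_DEMO_RUN = 4   # from the video: the demo conversion cost about four coins
FREE_TRIAL_COINS = 40    # from the video: coins included in the free trial


def estimate_cost_usd(
    units: int,
    passes_per_unit: int = 1,
    coins_per_pass: float = COINS_PER_DEMO_RUN,
) -> float:
    """Rough dollar cost, assuming every pass costs the same as the demo run.

    `units` and `passes_per_unit` are hypothetical planning inputs, not
    anything Bob reports; linear scaling is an assumption.
    """
    return units * passes_per_unit * coins_per_pass * COIN_PRICE_USD


def free_trial_runs(coins_per_pass: float = COINS_PER_DEMO_RUN) -> int:
    """How many demo-sized runs fit inside the free trial."""
    return int(FREE_TRIAL_COINS // coins_per_pass)
```

Under those assumptions, estimate_cost_usd(1) gives 2.0, matching the demo, and free_trial_runs() gives 10. A hypothetical project of 2,000 demo-sized units at three passes each, estimate_cost_usd(2000, passes_per_unit=3), comes to 12000.0. That figure covers only the coins, not the people reviewing the output.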
More importantly, IBM isn't selling you Bob because they care about your COBOL problem. They're selling it because they want to keep you in the IBM ecosystem. If Bob makes it easy to modernize your mainframe applications while keeping you on IBM infrastructure, IBM wins. If you use Bob and then migrate everything to AWS, IBM loses. Watch for where the incentives point.
What's Actually New Here
Strip away the marketing and what do you have? An IDE that wraps AI capabilities in a more structured interface than most competitors. The mode system is genuinely useful. The security review workflow is cleaner than anything I've seen in a general-purpose AI coding tool. The permission system—where you explicitly define what Bob can and can't do—acknowledges that full autonomy is often the wrong answer.
Is this revolutionary? No. Is it useful? Probably, for a specific kind of developer working on a specific kind of problem. If you're responsible for legacy enterprise systems and you need to triage security issues or prototype modernization approaches, Bob might save you time. If you're building a greenfield startup, you probably don't care.
The real test won't be demos. It'll be whether organizations with actual legacy COBOL systems—banks, insurance companies, government agencies—trust Bob enough to use it on production code. That's a very different question than whether it can convert a demo ATM repo.
For now, Bob is another tool in the expanding toolkit of AI coding assistants. Better structured than some, more specialized than others, priced like IBM always prices things—strategically. Whether it's the "next autonomous architect" or just another way to generate Python from COBOL depends entirely on what you're trying to build, and how much you trust the ghost in the machine.
—Mike Sullivan
Watch the Original Video
I Modernized an 80s ATM System in 3 Minutes with IBM’s Bob (Full Breakdown)
Better Stack
8m 6s
About This Source
Better Stack
Since its launch in October 2025, the YouTube channel Better Stack has rapidly become a go-to destination for tech professionals, amassing over 91,600 subscribers. The channel positions itself as a cost-effective alternative to enterprise solutions, focusing on software development, AI applications, and cybersecurity.