AI Pair Programming: Productivity Tool or Security Risk?

AI pair programming promises faster code and fewer bugs. But what happens when your AI collaborator is confidently wrong about security? A practical read for developers.

Written by AI. Rachel "Rach" Kovacs

June 16, 2026 · 6 min read

The most interesting thing Sam Anthony says in IBM's recent explainer on AI pair programming isn't in the pitch. It's buried near the end, almost as a footnote: "AI can be very confidently wrong, especially when it's not an expert in your business context."

That sentence is doing a lot of work. And if you're a developer writing code that touches authentication, payment flows, or anything else a bad actor might want to get at, it should give you pause before you accept the next autocomplete suggestion.

I cover security, not developer productivity. But those two things aren't separate anymore, and that's the real story here.

What IBM is actually selling

The video frames AI pair programming as a natural evolution of human pair programming — two heads better than one, except now one head never sleeps, never needs to vent about the sprint planning meeting, and can generate test cases while you grab coffee. Anthony's pitch is genuinely coherent: AI handles the tedious parts of the development inner loop (context-switching, documentation, boilerplate), developers stay in the driver's seat on judgment calls, and the result is faster cycles with fewer blockers.

The productivity case is reasonable. The colleague-or-tool debate over how to classify these systems is interesting, but it's mostly academic when your deadline is Friday. What matters is whether the output is correct — and for security-sensitive code, correct means something more specific than "compiles and passes unit tests."

Human pair programming does have a research base behind it — Laurie Williams' foundational work suggested it catches bugs earlier and produces more maintainable code — but later meta-analyses have been more qualified, particularly for experienced developers. The evidence is real but not the slam dunk that productivity advocates tend to imply. AI pair programming inherits both the genuine benefits and the open questions.

The part the video skips

Here's what a developer productivity video from IBM is not going to spend time on: the security surface that AI-generated code creates.

When an AI coding assistant suggests an implementation, it's drawing on patterns from its training data. For well-documented tools like GitHub Copilot, that training includes large volumes of public code. Other tools, such as Cursor and Amazon Q, document their training data mixes less publicly, so the precise sourcing varies. What doesn't vary is the underlying dynamic: the model has seen a lot of code, including a lot of insecure code, because public repositories are full of it. The internet is not a curated security curriculum.

Research has found that AI-generated code can reproduce known vulnerable patterns — buffer overflows, SQL injection setups, insecure random number generation — with the same fluency it reproduces everything else. The model doesn't know it's doing this. It's not being malicious. It's completing patterns. That's the problem.
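To make two of those patterns concrete, here is a minimal Python sketch. It is an illustration, not output from any particular assistant: the `users` table, the function names, and the injected payload are all hypothetical. It contrasts a string-built SQL query with a parameterized one, and shows token generation with the standard library's `secrets` module rather than `random`.

```python
import secrets
import sqlite3


def find_user_unsafe(conn, username):
    # Vulnerable pattern: user input is spliced directly into the SQL string.
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Parameterized query: the driver passes username as data, never as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


def new_session_token():
    # secrets uses the OS cryptographic RNG; the random module is
    # predictable and not meant for tokens or passwords.
    return secrets.token_urlsafe(32)


def demo():
    # Hypothetical in-memory table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    payload = "nobody' OR '1'='1"  # classic injection string
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)
```

Calling `demo()` runs the same payload through both lookups: the unsafe version returns every row in the table, while the parameterized version returns nothing. Both functions look equally plausible in an autocomplete suggestion, which is the point.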

Anthony's warning about "confident wrongness" is framed as a productivity issue: don't blindly accept AI output because it might be logically incorrect. But confident wrongness in a CRUD app is a bug. Confident wrongness in an authentication module is a CVE. The stakes aren't the same, and treating them as the same category of risk is where development teams can get into trouble.

This connects directly to what Anthropic found in their own research on AI and developer skill — the story is more complicated than "AI makes you better." There are real questions about whether heavy reliance on AI-generated code erodes the diagnostic instincts that let experienced developers recognize a subtly broken implementation when they see one.

Skill atrophy is the actual argument

I want to spend a moment on what I think is the most honest tension in Anthony's framing, because he's right about it even if he undersells the implications.

"Less time is spent writing code from scratch," he says, "and more time is spent outlining problems, designing systems, and evaluating the quality of solutions."

That's the optimistic version of a real shift. The less optimistic version: if a generation of developers writes significantly less code from scratch, they may become less equipped to evaluate the quality of solutions — because that evaluation skill is built by writing and debugging code yourself, failing at it, understanding why it failed. You can't audit code you don't fully understand, and you can't develop deep understanding primarily through review.

This isn't a theoretical risk. It's how skill atrophy works in every technical domain. The spreadsheet analogy gets invoked a lot in these conversations (calculators didn't kill mathematicians, spreadsheets didn't kill accountants), but the comparison is messier than it looks. There's documented evidence that the introduction of spreadsheet software did reduce accounting clerk employment meaningfully through the 1980s and '90s. It's not a clean parallel for "skills shift, not job loss." The more honest statement is: the role changes, some people adapt, some don't, and the skills that matter shift in ways that aren't always predictable in advance.

What to actually do with this

Anthony's bottom line is that active engagement is non-negotiable: "If you blindly accept everything AI produces, you're not really collaborating." Fair. But "don't be passive" is advice, not a framework. Here's what I'd actually want a developer to think through before leaning into AI pair programming at work:

What kind of code am I generating? The risk profile for AI-assisted boilerplate is very different from AI-assisted authentication logic or cryptographic implementation. Know which is which before you accept suggestions.

Does my review process account for the specific failure modes of AI-generated code? That means looking for overconfident implementations of security patterns, checking whether suggested dependencies have known vulnerabilities, and not assuming that "it looks right" means "it is right" — because AI output is optimized to look right.

What does my security team know about the tools we're using? This is a question many development teams aren't asking yet. If AI coding assistants are generating code that touches sensitive data flows, your security posture should account for that. The question of what data those tools send back to their servers is also worth asking explicitly — some tools have clearer data handling policies than others.

Am I building the skill or borrowing it? There's a difference between using AI to accelerate work you understand deeply enough to evaluate, and using it to produce work you couldn't produce or critique yourself. The first is a productivity tool. The second is a dependency.
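One way a team might act on the first question is to route changes by risk before anyone reviews them. The Python sketch below is a rough illustration under stated assumptions: the path patterns, tier names, and function names are hypothetical, and a real team would tune them to its own repository layout.

```python
from fnmatch import fnmatch

# Hypothetical patterns for security-sensitive paths; adjust per repository.
HIGH_RISK_PATTERNS = ["*auth*", "*crypto*", "*payment*", "*session*"]


def review_tier(path: str) -> str:
    # Compare case-insensitively so "Payment_flow.py" is still caught.
    lowered = path.lower()
    if any(fnmatch(lowered, pattern) for pattern in HIGH_RISK_PATTERNS):
        return "security-review"
    return "standard-review"


def triage(changed_files):
    # Map each changed file to the review tier it should receive.
    return {path: review_tier(path) for path in changed_files}
```

For example, `triage(["src/auth/login.py", "src/ui/button.css"])` sends the first file to `security-review` and the second to `standard-review`. Substring patterns this broad will produce false positives (a file like `docs/authors.md` would match `*auth*`), which is arguably the safer direction to err in.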

None of this means AI pair programming is a bad idea. The productivity gains Anthony describes are real and the framing — AI accelerates the loop, humans make the calls — is the right one. But "humans make the calls" only works if the humans are equipped to make them. Right now, that's less a given than the IBM video implies.

The promise of AI pair programming is that it makes developers more capable of tackling bigger problems. The open question is whether it's building that capability or borrowing against it.


Rachel "Rach" Kovacs is BuzzRAG's cybersecurity and privacy correspondent.
