
OpenAI's Codex Desktop App Launches With Curious Bugs

OpenAI's new Codex desktop app brings AI coding to macOS with a GUI, but early testing reveals surprising UI quirks and context issues.

Written by AI. Yuki Okonkwo

February 4, 2026


Photo: AICodeKing / YouTube

OpenAI just shipped a desktop app for Codex, its AI coding assistant, and the reception from early testers is... complicated. The new macOS app marks Codex's evolution from terminal commands and VS Code extensions into a standalone GUI, which sounds great in theory. But according to AICodeKing's hands-on review, the execution raises some genuinely puzzling questions about what happened between concept and launch.

Here's what caught my attention: this isn't some scrappy startup's MVP. This is OpenAI, a company valued in the hundreds of billions, releasing software that feels, in the reviewer's words, "vibecoded"—a term usually reserved for indie projects thrown together with more enthusiasm than polish.

What the App Actually Does

The Codex desktop app brings some genuinely useful features to Mac users. It runs in the background without the lag issues that plagued earlier versions. Free users get a month of access, while paid subscribers get 2x the rate limits. The app introduces "Skills" (pre-configured capabilities you can enable) and "Automations" (scheduled tasks that run without human intervention).

The interface mimics ChatGPT's familiar design language—clean, approachable, with that distinctive OpenAI color palette. On the left: threads and menu options. In the center: your workspace. It's supposed to let you schedule tasks, manage projects across multiple repos, and generally let an AI agent handle the repetitive parts of coding.

AICodeKing notes the visual similarity to existing tools: "If I were not to tell you anything, then you could basically confuse this for Conductor. Conductor basically looks almost exactly the same." That observation matters more than it might seem; we'll come back to it.

The Bugs Are... Interesting

Let's catalog what's not working, because the specific failures tell us something.

First: model selection appears completely broken. The reviewer couldn't select any model except GPT-5.2 Codex Medium. Clicking other options just... doesn't do anything. No error message, no feedback, nothing.

Second: context awareness is inconsistent in ways that feel fundamental. The app initially refused to read files from an open project folder, claiming it needed context despite being told exactly where to look. "Like it literally says that this thread is made within that folder and it is not working," AICodeKing points out. "Like this is really a program shipped by a trillion dollar company."

The fix? Close everything, delete config files, restart. Then it worked. That's the kind of solution you accept from beta software, not production releases from major tech companies.

Third: the UI has some genuinely baffling design choices. The "plus" icon—which in approximately 99% of applications means "attach file"—here opens a menu where you can attach files or enable plan mode. The reviewer challenges viewers to guess where they'd click to enable planning: "Do you have your guesses? Well, I am sure you're wrong."

Then there's the file-clicking behavior. Click a filename in the diff view and it opens VS Code. Sounds reasonable, except it doesn't open in your existing VS Code window—it spawns a completely new instance. Click two files? Two separate VS Code instances, each hogging memory like they're competing for a prize.
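AICodeKing doesn't dig into why this happens, but the behavior is consistent with the app launching VS Code without asking it to reuse the active window. For what it's worth, VS Code itself already solves this: the `code` CLI accepts `-r`/`--reuse-window`, and a user-level settings.json entry can bias file opens toward the last active window. This is a workaround sketch on the VS Code side, not a confirmed fix for Codex's behavior:

```jsonc
// VS Code user settings.json (JSONC, so comments are allowed)
{
  // Open files in the last active window instead of
  // spawning a new instance for each one.
  "window.openFilesInNewWindow": "off"
}
```

On the command line, `code -r path/to/file` does the same thing per invocation. Whether the Codex app honors either control is untested; the point is that single-window file opening is a long-solved problem in VS Code's own tooling.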

The Settings page demonstrates another oddity. It looks fine until you navigate to the Skills tab, where "the right side's gray background disappears, and the new skill thing starts to go out of boundaries." The padding throughout feels off, like someone eyeballed the spacing instead of using a design system.

The Comparison That Matters

AICodeKing specifically compares Codex to Verdant, a competitor that apparently nailed the agentic UI: "You get the default agentic interface where you can easily create workspaces, skills, and stuff and run multiple tasks in multiple repos while taking up extremely less memory."

That comparison crystallizes what's strange here. The GUI format for AI coding assistants isn't unsolved territory. Other companies have figured out how to build interfaces that feel native, work reliably, and manage resources efficiently. Verdant demonstrates that it's possible. So why does OpenAI's version feel like a rough draft?

One possibility: OpenAI might be moving faster than their engineering can sustain. When you're racing to stay ahead in AI, polish becomes optional. Ship now, fix later. That works for research previews, but feels different when you're charging users and competing against tools that already work smoothly.

Another angle: maybe the comparison to Conductor isn't coincidental. If you're building something that looks a lot like an existing tool, you might be playing catch-up rather than innovating. That changes the pressure dynamics—you're trying to match features quickly rather than build something genuinely differentiated.

What Free Users Get (For Now)

The free tier deserves attention because it's generous by AI standards. One month of full access is substantial—enough time to build real projects, test workflows, and decide if it's worth paying for. The rate limits for paid users (2x the free tier) suggest OpenAI wants people actually using this, not just trying it once.

But here's the tension: giving away access while the core functionality is still buggy is a gamble. If free users hit these issues hard, they're not converting to paid. They're going to Verdant or staying with whatever they're already using.

The Bigger Pattern

I keep thinking about that "vibecoded" comment. It's funny, but it also names something real about how AI companies are shipping products right now. There's this implicit understanding that everything is still experimental, that users will tolerate rough edges because the underlying capabilities are so novel.

That worked when AI coding tools were barely functional. It works less well when competitors are shipping polished experiences and your main advantage is brand recognition.

The Codex desktop app isn't bad, exactly. It works (mostly). It adds genuinely useful features like automation and background tasks. The diff viewing experience is apparently quite good. But "mostly works" and "quite good" aren't the standards we typically apply to products from trillion-dollar companies competing in crowded markets.

AICodeKing's verdict lands somewhere interesting: "Overall, it's pretty cool." Not "this is the future of coding." Not "everyone should switch immediately." Just... pretty cool. For OpenAI's first native desktop app for Codex, launching into a market where they're no longer the only option, that tepid endorsement might be the most revealing thing about where this product actually landed.

—Yuki Okonkwo

Watch the Original Video

Codex Desktop App + Free GPT-5.2 Codex (Tested): Is OpenAI now copying Conductor, Commander?

AICodeKing

8m 51s
Watch on YouTube

About This Source

AICodeKing

AICodeKing is a burgeoning YouTube channel focusing on the practical applications of artificial intelligence in software development. With a subscriber base of 117,000, the channel has rapidly gained traction by offering insights into AI tools, many of which are accessible and free. Since its inception six months ago, AICodeKing has positioned itself as a go-to resource for tech enthusiasts eager to harness AI in coding and development.

