Building a Serverless AI Agent with Pi and Google

There's a recurring fantasy in developer circles: the personal AI that knows your business, lives in the cloud, costs you almost nothing to run, and answers to no one but you. No SaaS subscription. No vendor reading your data. No pricing tier that inexplicably doubles when you hit a threshold.

ZazenCodes, a programming-focused YouTube channel, published a tutorial last week doing its best to make that fantasy practical. The video—28 minutes of live terminal work, architecture diagrams, and a few detours through Toronto barbershop receipts—walks through building a serverless AI bookkeeping agent using a framework called Pi, deploying it to Google Cloud Run, and getting it working on a phone. The whole thing, from empty folder to cloud-hosted app, happens inside a single session.

What's interesting about the approach isn't really the bookkeeping part. It's the architecture choices, and what they reveal about where DIY AI tooling currently sits.

The Stack, Unpacked

Pi is the load-bearing piece here, and the one most developers won't have heard of. The tutorial describes it as an "agent harness": the infrastructure layer that wraps a language model and handles memory, skills, and configuration so you're not rebuilding those primitives from scratch every time. The underlying model (in this case, Anthropic's Claude via API) is essentially a plugin. Pi handles the rest.

"Pi is everything surrounding the language model," the creator explains. "It's like the infrastructure that makes it accessible to us."

The rest of the stack is deliberately mundane: an Express HTTP server for the chat frontend, Google Cloud Run for serverless container hosting, and Google Cloud Storage for persistence. That last choice deserves some attention.

Serverless architectures have a fundamental problem with state. When the container instance spins down, which happens automatically when nobody's using it and is the whole cost-saving point, any data stored on its local filesystem vanishes. For a chatbot that's logging your expenses, that's fatal. The solution here is to mount a Cloud Storage bucket using Cloud Storage FUSE (often shortened to GCS Fuse), a filesystem interface that lets the container treat cloud object storage like a local directory. Cloud Run has native support for this. The tutorial notes this as a discovery in real time: "Honestly, guys, I did not even know about this."
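The persistence pattern can be sketched in a few lines of TypeScript. This is a minimal illustration, not the tutorial's code: the `Expense` fields, the `loadExpenses` and `appendExpense` helpers, and the idea of passing the mount path as `dataDir` are all assumptions made for the example. Only the `expenses.json` file name comes from the tutorial. Because a mounted bucket looks like an ordinary directory, the sketch needs nothing beyond Node's standard `fs` and `path` modules.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical shape of one expense record; the tutorial does not spell out its schema.
interface Expense {
  date: string;     // ISO date string
  vendor: string;
  amount: number;   // in the user's currency
  category: string;
}

// Read the current list, or start empty if the file does not exist yet.
function loadExpenses(dataDir: string): Expense[] {
  const file = path.join(dataDir, "expenses.json");
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, "utf8")) as Expense[];
}

// Append one record by rewriting the whole file.
// If dataDir is a mounted bucket, the write ends up in object storage
// rather than on the container's temporary local disk.
function appendExpense(dataDir: string, expense: Expense): Expense[] {
  const expenses = loadExpenses(dataDir);
  expenses.push(expense);
  fs.mkdirSync(dataDir, { recursive: true });
  fs.writeFileSync(
    path.join(dataDir, "expenses.json"),
    JSON.stringify(expenses, null, 2)
  );
  return expenses;
}
```

Rewriting the whole file on every change is a deliberate simplification. It suits a single-user tool, but a mounted bucket is not expected to coordinate two writers touching the same file, so anything multi-user would likely need a real database instead.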

That candor is either refreshing or concerning depending on what you're hoping for. The tutorial isn't documentation; it's someone building in public, learning as they go. That has real value. It also means viewers following along will need to validate decisions independently.

Claude Does the Heavy Lifting

The most technically revealing portion isn't the deployment itself but watching the creator use Claude Code to write the deployment code. The prompt engineering is thoughtful. Rather than asking for a generic Express app, the creator constructs a detailed specification: target file structure, Pi configuration, memory schema (facts.json, expenses.json, an images directory), skills registry, Node version pinning, even a stylistic preference for dark mode. The initial build is described as a "one-shot": one prompt, one complete scaffold.

The catch is that Claude doesn't know what Pi is. It's not in the model's training data—Pi is too recent and too niche. So the creator instructs Claude to research Pi before writing any code, which it does by examining the locally installed package. That's a useful pattern for working with rapidly evolving frameworks: tell the model to read the source before writing against it.

The cloud deployment phase uses a similar approach, but with higher stakes. The creator grants Claude access to the authenticated gcloud CLI and instructs it to provision infrastructure. Secrets get stored in Google Secret Manager. A service account gets created. The container gets configured with one CPU and a gigabyte of memory. Claude orchestrates all of this through prompted terminal commands.

"It's using one CPU, it's using a gig of memory. This will have implications on cost. And now we're adding volumes."

The creator turns off "bypass permissions" before this phase—meaning Claude has to ask before running each terminal command. That's the right call when an AI is provisioning real cloud resources. Small typos in infrastructure commands can be expensive.

What Actually Works

By the end of the tutorial, the stack does what it promises. A receipt photo dragged into the browser becomes a structured expense entry in a JSON file sitting in a cloud storage bucket. The same flow works from a phone. The agent correctly categorizes items, asks clarifying questions when categories don't fit cleanly, and persists everything between sessions.

The mobile demo is the satisfying part. Open the Cloud Run URL in Safari, authenticate with a shared password, tap "Add to Home Screen," and you have something that looks and behaves like a native app, except the intelligence is Claude called from a serverless container and the storage is a Google Cloud bucket, and both sit under accounts you control. For someone tired of paying $20/month for a notes app with "AI features," there's a real appeal here.

"This video will give you an agent that you can use, pretty much. That's what I'm telling you."

The creator is not wrong. But the tutorial is also honest about what's left undone.

The Honest Gaps

Authentication is the most obvious weak link, and the tutorial acknowledges it repeatedly. The security model is a single shared password stored in Google Secret Manager—no multi-factor authentication, no per-user sessions, no audit logging. The creator flags this multiple times: "I'm not advocating you to actually set this up... You can harden the authentication later, basically."

That "harden it later" posture is endemic to developer tutorials, and it deserves some scrutiny. For a personal productivity tool that only you ever open, shared-password auth is probably fine. For anything that touches real financial data and might be accessed from unfamiliar networks, it isn't. The tutorial gestures at the gap without filling it.
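One small step in the hardening direction is making sure the password comparison itself doesn't leak timing information. The sketch below is an illustration under stated assumptions, not something shown in the tutorial: the `passwordMatches` and `sha256` helpers are hypothetical names, and the expected password is assumed to reach the server however the deployment exposes the secret. It uses Node's built-in `crypto.timingSafeEqual`, and hashes both inputs first because that function requires buffers of equal length.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hash a string to a fixed-length buffer so both sides of the
// comparison are always 32 bytes, whatever the input length.
function sha256(value: string): Buffer {
  return createHash("sha256").update(value, "utf8").digest();
}

// Hypothetical helper: returns true only when the provided password
// matches the configured one, compared in constant time.
function passwordMatches(provided: string, expected: string): boolean {
  // Refuse to authenticate anyone if no password has been configured.
  if (expected.length === 0) return false;
  return timingSafeEqual(sha256(provided), sha256(expected));
}
```

This closes only one narrow hole. It does nothing for the gaps listed above: there is still one shared secret, no second factor, no per-user session, and no audit trail.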

There's also the question of cost visibility. The tutorial mentions that the CPU and memory settings have cost implications without quantifying them. Google Cloud Run is genuinely cheap for low-traffic personal tools, and you can often stay within free-tier limits, but the combination of Cloud Run, Cloud Storage, and Secret Manager, plus Anthropic API costs per token, means the economics depend heavily on usage. Someone following this tutorial should do their own math before treating it as a free alternative to commercial tools.
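That math is simple enough to write down. The sketch below is a back-of-envelope estimator, not a pricing reference: the `UsageEstimate` fields and `estimateMonthlyCost` function are hypothetical, and every number in the example call is a made-up placeholder chosen to keep the arithmetic easy, not a real price from Google or Anthropic. Anyone using it would substitute figures from the current price lists and their own usage.

```typescript
// All inputs are supplied by the caller; nothing here encodes a real price.
interface UsageEstimate {
  messagesPerMonth: number;
  inputTokensPerMessage: number;
  outputTokensPerMessage: number;
  inputPricePerMillionTokens: number;   // from the model provider's price list
  outputPricePerMillionTokens: number;  // from the model provider's price list
  hostingAndStoragePerMonth: number;    // compute, storage, and secrets beyond any free tier
}

function estimateMonthlyCost(u: UsageEstimate): number {
  const inputCost =
    ((u.messagesPerMonth * u.inputTokensPerMessage) / 1e6) * u.inputPricePerMillionTokens;
  const outputCost =
    ((u.messagesPerMonth * u.outputTokensPerMessage) / 1e6) * u.outputPricePerMillionTokens;
  return inputCost + outputCost + u.hostingAndStoragePerMonth;
}

// Placeholder figures for illustration only.
const placeholderEstimate = estimateMonthlyCost({
  messagesPerMonth: 200,
  inputTokensPerMessage: 5000,
  outputTokensPerMessage: 500,
  inputPricePerMillionTokens: 2,
  outputPricePerMillionTokens: 10,
  hostingAndStoragePerMonth: 1,
});
```

With these placeholder inputs the token terms come to 2 and 1, and the total to 4. The useful part is the shape of the formula: the token terms scale with how much you chat and how large each request is (receipt images can inflate the input side), while the hosting term stays near its floor for a tool that is mostly idle.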

The version-control discipline section is one of the stronger parts. The tutorial shows exactly how to use .gitignore to keep real expense data and receipt images out of the repository while preserving example files for onboarding. That's a detail that beginners routinely get wrong, and getting it right from the start matters.

The Larger Question

What this tutorial really demonstrates is that the barrier to deploying a personal AI agent has dropped far enough that a motivated developer can do it in an afternoon. That's genuinely new. A year ago, this required either accepting a commercial vendor's terms or maintaining a significant amount of infrastructure glue yourself.

The Pi framework abstracts a lot of the hard parts. Claude Code reduces the amount of infrastructure knowledge you need to hold in your head. Google Cloud Run handles the scaling and availability that used to require a DevOps team. The combination is meaningfully accessible in a way that similar stacks weren't before.

What hasn't changed is that "accessible" and "production-ready" are still different categories. The tutorial builds something real and functional—but it's a sketch of a system, not a finished one. The creator knows this. The more interesting question is whether the frameworks and tools are now mature enough that turning a sketch into something hardened is a weekend project rather than a month-long one.

Based on what's in this tutorial, the honest answer is: almost.

Marcus Chen-Ramirez is a senior technology correspondent for Buzzrag covering AI, software development, and the intersection of technology and society.